cfsearch with a category restriction

ColdFusion’s Verity is difficult to figure out. There’s hardly any documentation for it online and there are a number of quirky little issues that cause a lot of frustration. Today I was trying to do what seemed to be rather simple, at least according to the Adobe docs: searching a collection that contains categories.

I have a table of articles, a table of categories, and a relational table joining the two. It’s a many-many relationship (an article may have multiple categories and a category may have multiple articles). I also have a collection created in CF Administrator named articles_index with the categories option enabled.

With the following code I queried and indexed the data from my articles table, and included a comma-list (generated by a MS SQL UDF) of custom categories in the cfindex’s category param, see below:

<cfquery name="IndexArticles" datasource="webdsn">
     SELECT *,
     dbo.CategoryIDList(articleID) CatIDList
     FROM dbo.Articles
</cfquery>

<cfindex
query="IndexArticles"
collection="articles_index"
action="refresh"
type="Custom"
key="articleID"
title="articleID"
body="articleName,articleAuthor,articlePubDate,articleDescription,articleKeywords"
category="CatIDList">

<h3>indexing complete</h3>

And if you’re curious, this is how my UDF looks in MS SQL that created the CatIDList (called in the IndexArticles SELECT statement above):

CREATE FUNCTION dbo.CategoryIDList(@articleID char(35))
RETURNS VARCHAR(1000) AS

BEGIN
DECLARE @CatIDList varchar(1000)

SELECT @CatIDList = COALESCE(@CatIDList + ',', '') + RTRIM(articleID)

FROM
dbo.Category

INNER JOIN dbo.Document ON clocatID = clocatdocCategory
INNER JOIN dbo.articleument ON clocatdocDocument = articleID

WHERE articleID = @articleID

RETURN @CatIDList
END

This worked fine, and I could dump the collection on screen and see that the category field was being correctly populated. But when I tried to search this collection with a category restriction, no results would return — even though entries in the collection had corresponding categories listed.

For example, running a cfsearch such as the one below would return 0 results even though I had an article in the system with the category=”law”:

<cfsearch name="GetResults" collection="articles_index" criteria="*" category="law" >

I looked around online and found a site that suggested deleting the collection, restarting the ColdFusion Application Service on the server, and then recreating the collection, not using CF Administrator, but using the cfcollection tag like so:

<cfcollection action="create" collection="articles_index" path="C:\inetpub\collections\" language="english" categories="yes">

After re-creating the collection, I re-indexed my tables and to my surprise I could actually get search results to return. The cfsearch would produce results as long as the category I was restricting by was either the only category an article possessed, or was the first of the series of categories the article possessed.

So if I had an article that was in both law and health categories, ‘law,health’ in the category field in the index, restricting by the category health would not return the result, but restricting by the category law would. This is not acceptable because I need to search by any variety of categories that might be in any order.

I spent a lot more time on Google and could not find a solution to the multiple categories problem, so I came up with my own logic.

This is basically how it works:

  1. Run a search on the collection, based on a user’s criteria
  2. Loop through these results
    1. Loop through each result’s category field
    2. Check the category list to see if it matches the restricted category (that I am passing via the variable ‘cid’)
    3. Collect all the primary keys of our articles that meet this requirement into a list called SecondResults
  3. Loop through the SecondResults list
    1. Add each list entry to a new list named ThirdResults, appropriately formatted to be used as a variable in a ‘WHERE IN’ SQL statement
  4. Query all of the articles that are in the ThirdResults list in a query named FinalResults
  5. FinalResults will contain the results you are looking for, based on criteria and a restricted category

See the code below:

<!--- Set default params --->
<cfparam name="SecondResults" default="">
<cfparam name="ThirdResults" default="">
<cfparam name="FinalResults" default="">

<!--- The user has submitted criteria --->
<cfif criteria NEQ ''>

     <!--- Search our collection --->
     <cfsearch name="FirstResults" collection="articles_index" criteria="#criteria#" >

     <!--- Loop through the collection results --->
     <cfloop query="FirstResults">

          <!--- Check if category restriction exists (I pass cid to this page through the URL) --->
          <cfif cid NEQ ''>
               <!--- If we are using a category restriction, collect PK's for entries that apply --->
               <cfloop list="#FirstResults.Category#" index="i">
                    <cfif i EQ cid>
                         <cfset SecondResults = SecondResults & "," & FirstResults.Key>
                    </cfif>
               </cfloop>
          <cfelse>
               <!--- Else collect all the PK's --->
               <cfset SecondResults = SecondResults & "," & FirstResults.Key>
          </cfif>

     </cfloop>

     <!--- Create a SQL friendly list for the WHERE IN clause --->
     <!--- Variable for flagging if this is the first pass of the loop --->
     <cfset count = 0>
     <cfloop list="#SecondResults#" index="i">
          <cfif count EQ 0>
               <!--- If this is the first pass of the loop, add the first item without a preceding comma --->
               <cfset ThirdResults = "'" & i & "'">
               <cfset count = count + 1>
          <cfelse>
               <!--- If this is not the first pass of the loop, use a preceding comma --->
               <cfset ThirdResults = ThirdResults & ",'" & i & "'">
          </cfif>
     </cfloop>

     <!--- Query our final version of the results --->
     <cfquery name="FinalResults" datasource="webdsn">
          SELECT *,
          dbo.CategoryIDList(articleID) CatIDList
          FROM dbo.Articles
          WHERE articleID IN (<cfoutput>#PreserveSingleQuotes(ThirdResults)#</cfoutput>)
     </cfquery>

</cfif>

FinalResults will be the variable you want to cfloop query to output your results for the user.

It works but it’s not pretty. The code is redundant, and as such will take longer to execute. Unfortunately this was the only way I could think of to address the issue of searching entries that have multiple categories.

I hope this helps someone! If you have found a better way of doing this or have any questions, please let me know.

Leave a Reply