
The anatomy of smart search in Joomla 5. Creating a plugin Part 3

In the previous article, we started to code our custom plugin for the Joomla smart search component. So, let's finish it!

The index() method

This method adapts the data obtained from the database so that it can be submitted for indexing. The $item element (article, product, tag, etc.) is passed to it as an instance of the \Joomla\Component\Finder\Administrator\Indexer\Result class. The properties of $item match the columns that we selected from the database. The ultimate goal of this method is to call $this->indexer->index($item).

To decide which property of $item maps to which part of the output, you first need to understand what a search result looks like.

Joomla 5 smart search result structure

Here I take the names straight from the code. In this case, the images are not shown in the search results, but they are also there - $this->result->imageUrl.

  • imageUrl  - a picture of the Joomla article, product, tag, contact. It is displayed if enabled in the settings of the smart search component.
  • title - the title of the material, the name of the product, contact, etc.
  • description and body - a text description. We remember that many entities in Joomla have a short and complete description or an introductory and full text. Here they are combined and then trimmed to the character limit specified in the settings. body is the full text or description.
  • getTaxonomy() - the method receives and outputs taxonomy data for a given search result.

Thus, only a little data is visible to the user: just 4 types. We get more data than that from the database, so we need to decide which of it will be available for indexing by search, and which will only be displayed.

Here is the code of the index() method with comments.

<?php
/**
 * Method to index an item. The item must be a Result object.
 *
 * @param   Result  $item  The item to index as a Result object.
 *
 * @return  void
 *     
 * @throws  \Exception on database error.
 * @since   2.5
 */
protected function index(Result $item)
{
	// Setting the language of the indexed element - the default language of the site
	$item->setLanguage();
	// Check if JoomShopping is enabled in Joomla.
	if (ComponentHelper::isEnabled($this->extension) === false)
	{
		return;
	}
	// We take part of the paths to the pictures from the component parameters.
	$this->loadJshopConfig();
	// Setting the context for indexing
	$item->context = 'com_jshopping.product';

	// We collect all the parameters in a bunch: the search component, JoomShopping and the site.
	// They will be available to us in the output layout
	$registry     = new Registry($item->params);
	$item->params = clone ComponentHelper::getParams('com_jshopping', true);
	$item->params->merge($registry);
	$item->params->merge((new Registry($this->jshopConfig)));
	// Meta-data: meta-keywords, meta description, author,
	// index / no-index values for robots
	$item->metadata = new Registry($item->metadata);

	// Process the content with content plugins - the onContentPrepare event.
	// Content plugins always have a context check.
	// If it is equal to 'com_finder.indexer', then content plugins will usually not work.
	// ONLY TEXT should be given for indexing.
	// Neither pictures nor YouTube videos should get there, so
	// the indexed content will be cleared of HTML tags.
	// The raw short codes are just text and can get
	// into the search results.

	$item->summary = Helper::prepareContent($item->summary, $item->params, $item);
	$item->body    = Helper::prepareContent($item->body, $item->params, $item);
	// Include the JoomShopping classes here.
	require_once JPATH_SITE . '/components/com_jshopping/bootstrap.php';
	\JSFactory::loadAdminLanguageFile($this->languageTag, true);

	//
	// A trick. We want to be able to search by price, product code, etc too.
	// Attaching all this data to the body.
	//

	// Manufacturer code
	$manufacturer_code = $item->getElement('manufacturer_code');
	if (!empty($manufacturer_code))
	{
		$item->body .= ' ' . Text::_('JSHOP_MANUFACTURER_CODE') . ': ' . $manufacturer_code;
	}

	// EAN
	$product_ean = $item->getElement('product_ean');
	if (!empty($product_ean))
	{
		$item->body .= ' ' . Text::_('JSHOP_EAN') . ': ' . $product_ean;
	}

	// Old price
	$product_old_price = (float) $item->getElement('product_old_price');
	if (!empty($product_old_price))
	{
		$product_old_price = \JSHelper::formatPrice($item->getElement('product_old_price'));
		$item->body        .= ' ' . Text::_('JSHOP_OLD_PRICE') . ': ' . $product_old_price;
	}

	// Buy price
	$product_buy_price = (float) $item->getElement('product_buy_price');
	if (!empty($product_buy_price))
	{
		$product_buy_price = \JSHelper::formatPrice($item->getElement('product_buy_price'));
		$item->body        .= ' ' . Text::_('JSHOP_PRODUCT_BUY_PRICE') . ': ' . $product_buy_price;
	}
	// Price
	$product_price = (float) $item->getElement('product_price');
	if (!empty($product_price))
	{
		$product_price = \JSHelper::formatPrice($product_price);
		$item->body    .= ' ' . Text::_('JSHOP_PRODUCT_PRICE') . ': ' . $product_price;
	}


	// URL - the unique key of the element in the table. Read more about this after the sample code
	$item->url = $this->getUrl($item->slug, 'com_jshopping', $item->catslug);

	// Link to element - to JoomShopping product
	$item->route = $item->url;

	// A menu item can be created for a product, and a menu item has its own page heading.
	// The Joomla menu has higher priority, so we take the title from it.
	$title = $this->getItemMenuTitle($item->url);

	// Adjust the title if necessary.
	if (!empty($title) && $this->params->get('use_menu_title', true))
	{
		$item->title = $title;
	}

	// Add product image
	$product_image = $item->getElement('image');
	if (!empty($product_image))
	{
		$item->imageUrl = $item->params->get('image_product_live_path') . '/' . $product_image;
		$item->imageAlt = $item->title;
	}

	// The product has no author. But you can do this if you search by author/user
	// For example, you enabled search by author in the smart search component settings
	// and it should search for everything related to this author
	// $item->metaauthor = $item->metadata->get('author');

	// Add the metadata processing instructions.
	$item->addInstruction(Indexer::META_CONTEXT, 'metakey');
	$item->addInstruction(Indexer::META_CONTEXT, 'metadesc');
	// $item->addInstruction(Indexer::META_CONTEXT, 'metaauthor');
	// $item->addInstruction(Indexer::META_CONTEXT, 'author');
	// $item->addInstruction(Indexer::META_CONTEXT, 'created_by_alias');

	// Access group for the default search result.
	// We hardcode "1" - that is, for everyone. But here you can
	// take the access group from the product.
	// Or show different search results to different access groups.
	$item->access = 1;

	// Check if the category for the product is published.
	// Products should only be published if their category is published.
	$item->state = $this->translateState($item->state, $item->cat_state);

	// Get the list of taxonomies to display from the plugin parameters
	$taxonomies = $this->params->get('taxonomies', ['product', 'category', 'language']);
	// Name of the search type in the drop-down list of types: materials, contacts.
	// In our case - product. We take a language constant for this.
	$item->addTaxonomy('Type', Text::_('JSHOP_PRODUCT'));
	// Add the product category to the drop-down list of categories
	// so that you can search only within a specific category.
	// Here we pass the category name.
	$item->addTaxonomy('Category', $item->category);
	// Search only in the desired language
	$item->addTaxonomy('Language', $this->getLangTag());
	// Search results can be limited by publication start and end dates
	$item->publish_start_date = Factory::getDate()->toSql();
	$item->start_date         = Factory::getDate()->toSql();

	// Add additional data for the indexed element.
	// Here in the helper the "onPrepareFinderContent" event is called
	// This is usually how comments, tags, labels
	// and other things that should be available for search are added.
	// Accordingly, individual plugins work with this event.
	// In our case, we do not need this yet.
	// Helper::getContentExtras($item);
	
	// Add custom fields (com_fields) Joomla, if the component
	// supports them.
	// In our case, we do not need this yet.
	// Helper::addCustomFields($item, 'com_jshopping.product');

	// Index the item.
	$this->indexer->index($item);
}
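
The "trick" of attaching extra searchable values to the body can be reduced to one small loop. Below is a minimal sketch in plain PHP, without Joomla or JoomShopping. The function name appendSearchableFields() and the sample labels and values are my assumptions for illustration only; in the plugin the labels come from language constants and the prices from \JSHelper::formatPrice().

```php
<?php
// Hypothetical helper, not part of Joomla or JoomShopping:
// appends "Label: value" pairs to the text that will be indexed.
function appendSearchableFields(string $body, array $fields): string
{
    foreach ($fields as $label => $value) {
        // Skip empty strings, nulls and zero prices, like the empty() checks in index().
        if (empty($value)) {
            continue;
        }

        $body .= ' ' . $label . ': ' . $value;
    }

    return $body;
}

// Sample data, invented for the example.
$body = appendSearchableFields('Cordless drill.', [
    'Manufacturer code' => 'CD-18V',
    'EAN'               => '',
    'Old price'         => 0.0,
    'Price'             => '99.00 EUR',
]);

echo $body, PHP_EOL;
```

Only non-empty values reach the indexed text, so a product without an EAN or an old price does not add stray labels to the search index.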

Types of instructions for "marking up the weight" of content for indexing.

I am not an expert in fine-tuning indexing, so I will try to describe what I could see in the code. In the parameters of the smart search component, there are weight settings for each part of the indexed content: title, main text, metadata, url, additional text.

Joomla 5 smart search params - indexing weight settings for title, body, metadata

We have already seen these settings in the first article of the series, but for convenience we repeat the screenshot here.

In our smart search plugin, we can specify which data in our object for indexing belongs to which type by adding instructions:

<?php
// In the index() method
$item->addInstruction(Indexer::TEXT_CONTEXT, 'product_buy_price');

We look at the types of context for instructions and their names by default in the class \Joomla\Component\Finder\Administrator\Indexer\Result.

Joomla smart search indexing instructions types for weight markup

As I later found out, list_price and sale_price refer to indexing, not to the terminology of an online store.
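
To make the idea of instructions more tangible, here is a rough model of it in plain PHP. It is only a sketch of the concept, not the code of the Result or Indexer classes: the context names are plain strings instead of the Indexer constants, the weights are illustrative numbers, and the functions addInstruction() and weightOf() are hypothetical.

```php
<?php
// Illustrative weights only; the real multipliers are set in the
// Smart Search component options.
$weights = ['title' => 1.7, 'text' => 0.7, 'meta' => 1.2];

// Context => item properties that are tokenised with that context's weight.
$instructions = [
    'title' => ['title'],
    'text'  => ['summary', 'body'],
    'meta'  => ['metakey', 'metadesc'],
];

// Hypothetical counterpart of $item->addInstruction(): registers a property once.
function addInstruction(array $instructions, string $context, string $property): array
{
    if (!in_array($property, $instructions[$context] ?? [], true)) {
        $instructions[$context][] = $property;
    }

    return $instructions;
}

// Looks up the weight a property would be indexed with, or null if it is not indexed.
function weightOf(array $instructions, array $weights, string $property): ?float
{
    foreach ($instructions as $context => $properties) {
        if (in_array($property, $properties, true)) {
            return $weights[$context];
        }
    }

    return null;
}

$instructions = addInstruction($instructions, 'text', 'product_buy_price');

var_dump(weightOf($instructions, $weights, 'product_buy_price')); // indexed as body text
var_dump(weightOf($instructions, $weights, 'product_ean'));       // not indexed on its own
```

In this model a property that is not listed in any context is simply not tokenised, which is why the plugin either adds an instruction for a field or appends the field to the body.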

getUrl() method

The unique key of an indexed element is essentially its URL in system form: index.php?option=com_content&view=article&id=1. In the #__finder_links table it is stored in the url column. To build a frontend link from the search results to the element, a more complex variant that combines the id and the alias is used: index.php?option=com_content&view=article&id=1:article-alias&catid=2. It is stored in the adjacent route column. Joomla routing can also resolve the final URL without the alias, in which case url and route hold the same value.

<?php
// Fragment of index() method of smart search plugin for Joomla materials

// Create a URL as identifier to recognise items again.
$item->url = $this->getUrl($item->id, $this->extension, $this->layout);

// Build the necessary route and path information.
$item->route = RouteHelper::getArticleRoute($item->slug, $item->catid, $item->language);

The system url for the indexed element looks different in different components. In the components that follow the "Joomla way", you can use one controller, which, if no specific controller is found, will immediately show the desired View. Therefore, in standard Joomla components, we usually do not find links indicating the controller in the GET parameters. They will all look like index.php?option=com_content&view=article&id=15. This is the url that the getUrl() method of the Adapter class returns to us.

<?php
/**
 * Method to get the URL for the item. The URL is how we look up the link
 * in the Finder index.
 *
 * @param   integer  $id         The id of the item.
 * @param   string   $extension  The extension the category is in.
 * @param   string   $view       The view for the URL.
 *
 * @return  string  The URL of the item.
 *
 * @since   2.5
 */
protected function getUrl($id, $extension, $view)
{
    return 'index.php?option=' . $extension . '&view=' . $view . '&id=' . $id;
}

However, JoomShopping has its own history and builds URLs somewhat differently. We cannot use the standard method, so we override it.

<?php
use Joomla\CMS\Uri\Uri;

/**
 * @param string $product_id  Product id
 * @param string $extension  Always 'com_jshopping'
 * @param string $view  Not used for JoomShopping
 *
 * @return string
 *
 * @since 1.0.0
 */
public function getUrl($product_id, $extension, $view = 'not_used') : string
{
	/**
	 * Here is the trick: to construct a JoomShopping product URL
	 * we need only the product id and the category id.
	 */
	$this->loadJshopConfig();
	// Keeping in mind the difficulties with categories in JoomShopping
	// we separate the process of getting the category into a separate method.
	$category_id = $this->getProductCategoryId((int)$product_id);
	$url = new Uri();
	$url->setPath('index.php');
	$url->setQuery([
		'option'      => 'com_jshopping',
		'controller'  => 'product',
		'task'        => 'view',
		'category_id' => $category_id,
		'product_id'  => $product_id,
	]);
	// When constructing a URL in JoomShopping, it is advisable to find and specify
	// the correct Itemid, that is, the id of the menu item for JoomShopping.
	// Otherwise, we may get duplicate pages with different URLs.
	// We loaded the JoomShopping API earlier, so JSHelper should already be available.
	$defaultItemid = \JSHelper::getDefaultItemid($url->toString());
	$url->setVar('Itemid', $defaultItemid);

	return $url->toString();
}
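
To see what kind of string ends up in the url column for a product, the same query can be assembled with plain PHP. This is a sketch under assumptions: buildProductUrl() is a hypothetical stand-alone function, the ids are invented, and the Itemid is passed in by hand instead of being looked up with \JSHelper::getDefaultItemid().

```php
<?php
// Hypothetical stand-alone helper; the plugin itself uses Joomla's Uri class.
function buildProductUrl(int $productId, int $categoryId, ?int $itemId = null): string
{
    $query = [
        'option'      => 'com_jshopping',
        'controller'  => 'product',
        'task'        => 'view',
        'category_id' => $categoryId,
        'product_id'  => $productId,
    ];

    // The menu item id is optional here; in the plugin it is always resolved.
    if ($itemId !== null) {
        $query['Itemid'] = $itemId;
    }

    // Pass the separator explicitly so the result does not depend on php.ini.
    return 'index.php?' . http_build_query($query, '', '&');
}

echo buildProductUrl(15, 3, 101), PHP_EOL;
// index.php?option=com_jshopping&controller=product&task=view&category_id=3&product_id=15&Itemid=101
```

Unlike the URLs of core components, this one carries controller and task instead of view, which is exactly why the parent getUrl() does not fit.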

An atypical implementation of multilingualism. Indexation.

Let's return to the problem of multilingualism, in which we do not have a separate indexed entity for each content language. But there is an indexed entity that contains values for all languages at once.

The solution is to get a list of all languages inside the index() method, build a Result object for each language, and submit each object for indexing. We need to split the data into values that are the same for all languages (usually catid, access, etc.) and values that differ (title, description, fulltext, etc.). That is, $this->indexer->index() will be called several times inside the index() method of our plugin.

<?php
/**
 * Method to index an item. The item must be a Result object.
 *
 * @param   Result  $item  The item to index as a Result object.
 *
 * @return  void
 *
 * @throws  \Exception on database error.
 * @since   2.5
 */
protected function index(Result $item)
{

	// Initialise the item parameters.
	$registry     = new Registry($item->params);
	$item->params = clone ComponentHelper::getParams('com_swjprojects', true);
	$item->params->merge($registry);
	$item->context = 'com_swjprojects.project';
	$lang_codes    = LanguageHelper::getLanguages('lang_code');

	$translates = $this->getTranslateProjects($item->id, $item->catid);

	// Translate the state. projects should only be published if the category is published.
	$item->state = $this->translateState($item->state, $item->cat_state);

	// Get taxonomies to display
	$taxonomies = $this->params->get('taxonomies', ['type', 'category', 'language']);

	// Add the type taxonomy data.
	if (\in_array('type', $taxonomies))
	{
		$item->addTaxonomy('Type', 'Project');
	}

	$item->access = 1;
	foreach ($translates as $translate)
	{
		$item->language = $translate->language;
		$item->title    = $translate->title;
		// Trigger the onContentPrepare event.
		$item->summary = Helper::prepareContent($translate->introtext, $item->params, $item);
		$item->body    = Helper::prepareContent($translate->fulltext, $item->params, $item);

		$metadata       = new Registry($translate->metadata);
		$item->metakey  = $metadata->get('keywords', '');
		$item->metadesc = $metadata->get('description', $translate->introtext);
		// Add the metadata processing instructions.
		$item->addInstruction(Indexer::META_CONTEXT, 'metakey');
		$item->addInstruction(Indexer::META_CONTEXT, 'metadesc');

		$lang = '';

		if (Multilanguage::isEnabled())
		{
			foreach ($lang_codes as $lang_code)
			{
				if ($translate->language == $lang_code->lang_code)
				{
					$lang = $lang_code->sef;
				}
			}
		}
		// Create a URL as identifier to recognise items again.
		$item->url = $this->getUrl($item->id, $this->extension, $this->layout, $lang);

		// Build the necessary route and path information.
		$item->route = RouteHelper::getProjectRoute($item->id, $item->catid);

		// Get the menu title if it exists.
		$title = $this->getItemMenuTitle($item->route);

		// Adjust the title if necessary.
		if (!empty($title) && $this->params->get('use_menu_title', true))
		{
			$item->title = $title;
		}

		// Add the category taxonomy data.
		if (\in_array('category', $taxonomies))
		{
			$item->addTaxonomy('Category', $translate->category, 1, 1, $item->language);
		}

		// Add the language taxonomy data.
		if (\in_array('language', $taxonomies))
		{
			$item->addTaxonomy('Language', $item->language, 1, 1, $item->language);
		}


		$item->metadata = new Registry($item->metadata);

		$icon = ImagesHelper::getImage('projects', $item->id, 'icon', $item->language);

		// Add the image.
		if (!empty($icon))
		{
			$item->imageUrl = $icon;
			$item->imageAlt = $item->title;
		}

		// Add the meta author.
		// $item->metaauthor = $item->metadata->get('author');

		// Get content extras.
		// Helper::getContentExtras($item);
		// Helper::addCustomFields($item, 'com_swjprojects.project');

		// Index the item.
		$this->indexer->index($item);
	}
}

The unique key of the indexed element (the url field) must also differ for each language, otherwise the translations would overwrite each other in the index. So we make our own implementation of the getUrl() method and add the $lang parameter to the url.

<?php
/**
 * @param   int     $id
 * @param   string  $extension
 * @param   string  $view
 * @param   string  $lang  Language SEF code like `ru`, `en` etc
 *
 * @return string
 *
 * @since 2.1.0
 */
public function getUrl($id, $extension, $view, $lang = '')
{
	$url = 'index.php?option=' . $extension . '&view=' . $view . '&id=' . $id;

	if (!empty($lang))
	{
		$url .= '&lang=' . $lang;
	}

	return $url;
}
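
A quick way to check that every translation gets its own key is to build the keys for several languages and compare them. The sketch below is plain PHP; buildKey() is a hypothetical copy of the logic above, and the id and language codes are invented for the example.

```php
<?php
// Hypothetical stand-alone copy of the getUrl() logic shown above.
function buildKey(int $id, string $extension, string $view, string $lang = ''): string
{
    $url = 'index.php?option=' . $extension . '&view=' . $view . '&id=' . $id;

    return $lang !== '' ? $url . '&lang=' . $lang : $url;
}

// One project, three translations: each must produce a different key.
$keys = [];

foreach (['en', 'ru', 'de'] as $sef) {
    $keys[$sef] = buildKey(7, 'com_swjprojects', 'project', $sef);
}

print_r($keys);
```

Without the lang part all three calls would return the same string, and each call to $this->indexer->index() would refer to the same record instead of creating one per language.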

getItems() and getContentCount() methods

In general, we don't need anything else for manual indexing and reindexing of content, as well as scheduled via the CLI. However, if we come across some very unusual beast in the form of non-Joomla data, some third-party database, then we can completely redefine the logic of the parent Adapter class for these purposes.

  • getContentCount() - the method should return an integer - the number of indexed elements.
  • getItems($offset, $limit, $query = null) - under the hood, calls getListQuery() and sets $offset and $limit, brings everything to a single view - objects.

If using one getListQuery() method and accessing it from 3 other methods is inconvenient for some reason, you can customize requests and their processing in redefined methods.
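
For a data source that is not a Joomla table, the contract of these two methods can be sketched without any database at all. The class ExternalSourceSketch below is hypothetical and only imitates the roles of getContentCount() and getItems(); a real adapter would extend the Adapter class and return Result objects.

```php
<?php
// Hypothetical stand-in for an adapter whose data does not come from a Joomla table.
final class ExternalSourceSketch
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Same role as getContentCount(): the total number of items to index.
    public function getContentCount(): int
    {
        return count($this->rows);
    }

    // Same role as getItems(): one batch, with every row converted to an object.
    public function getItems(int $offset, int $limit): array
    {
        $batch = array_slice($this->rows, $offset, $limit);

        return array_map(static fn (array $row): object => (object) $row, $batch);
    }
}

// Sample data, invented for the example.
$source = new ExternalSourceSketch([
    ['id' => 1, 'title' => 'A'],
    ['id' => 2, 'title' => 'B'],
    ['id' => 3, 'title' => 'C'],
    ['id' => 4, 'title' => 'D'],
    ['id' => 5, 'title' => 'E'],
]);

// The indexer walks the source in batches, as it does during manual or CLI indexing.
$batches = 0;

for ($offset = 0; $offset < $source->getContentCount(); $offset += 2) {
    $items = $source->getItems($offset, 2);
    $batches++;
}

echo $batches, ' batches', PHP_EOL;
```

The point of the sketch is the split of responsibilities: one method reports how much work there is, the other returns a slice of it in a uniform shape.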

Reindexing content on the fly

To solve this problem, we have several ways, two of which are the periodic manual and CRON indexing, as I wrote above. However, this is fraught with an untimely index update and site users may not receive updated data in search results on time. Therefore, there is another way: reindexing the content on the fly, immediately after saving the changes. To do this, a content group plugin is created, which triggers smart search plugin events at the right moments.

Content plugin for joomla smart search

Standard Joomla Model Events

If the component is written according to the canons of Joomla and inherits its classes, then many models (Model - MVC) trigger standard events, among which we are interested in several:

  • onContentBeforeSave - The event is triggered before any Joomla entity is saved.
  • onContentAfterSave - The event is triggered after saving any Joomla entity.
  • onContentAfterDelete - The event is triggered after deleting any Joomla entity.
  • onContentChangeState - the event is triggered after a state change (unpublished/published).
  • onCategoryChangeState - The event is triggered after the category status changes (if the standard Joomla categories component is used).

By default, the Smart Search content plugin triggers reindexing at these points in time. Each of these events receives the context of the call in the form <component>.<entity>, for example com_content.article or com_menus.menu. Based on the context, you can decide whether to start reindexing or not. That check is done in the smart search plugin itself. An example from the code of the Finder content plugin for Joomla articles:

<?php
use Joomla\CMS\Event\Finder as FinderEvent;

/**
 * Smart Search after save content method.
 * Content is passed by reference, but after the save, so no changes will be saved.
 * Method is called right after the content is saved.
 *
 * @param   string  $context  The context of the content passed to the plugin (added in 1.6)
 * @param   object  $article  A \Joomla\CMS\Table\Table\ object
 * @param   bool    $isNew    If the content has just been created
 *
 * @return  void
 *
 * @since   2.5
 */
public function onContentAfterSave($context, $article, $isNew): void
{
    $this->importFinderPlugins();

    // Trigger the onFinderAfterSave event.
    $this->getDispatcher()->dispatch('onFinderAfterSave', new FinderEvent\AfterSaveEvent('onFinderAfterSave', [
        'context' => $context,
        'subject' => $article,
        'isNew'   => $isNew,
    ]));
}

As we can see, the onFinderAfterSave event is dispatched here, which is specific to smart search plugins. The onFinderAfterSave() method of our smart search plugin then checks for the desired context and reindexes the item.

<?php
use Joomla\CMS\Event\Finder as FinderEvent;

/**
 * Smart Search after save content method.
 * Reindexes the link information for an article that has been saved.
 * It also makes adjustments if the access level of an item or the
 * category to which it belongs has changed.
 *
 * @param   FinderEvent\AfterSaveEvent   $event  The event instance.
 *
 * @return  void
 *
 * @since   2.5
 * @throws  \Exception on database error.
 */
public function onFinderAfterSave(FinderEvent\AfterSaveEvent $event): void
{
    $context = $event->getContext();
    $row     = $event->getItem();
    $isNew   = $event->getIsNew();

    // We only want to handle articles here.
    if ($context === 'com_content.article' || $context === 'com_content.form') {
        // Check if the access levels are different.
        if (!$isNew && $this->old_access != $row->access) {
            // Process the change.
            $this->itemAccessChange($row);
        }

        // Reindex the item.
        $this->reindex($row->id);
    }

    // Check for access changes in the category.
    if ($context === 'com_categories.category') {
        // Check if the access levels are different.
        if (!$isNew && $this->old_cataccess != $row->access) {
            $this->categoryAccessChange($row);
        }
    }
}

In the same way, the work is organized when the state changes and the article or product is deleted.
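
The common pattern in all of these handlers is "check the context first, then act". Here is a minimal sketch of that decision in plain PHP. The function resolveAction() and the returned action names are hypothetical; in a real plugin the branches call reindex(), remove() and the state-change helpers of the Adapter class.

```php
<?php
// Hypothetical sketch: decides what a smart search plugin does with an event.
function resolveAction(string $event, string $context, array $handledContexts): string
{
    // Events fired for other components are not our business.
    if (!in_array($context, $handledContexts, true)) {
        return 'ignore';
    }

    switch ($event) {
        case 'onFinderAfterSave':
            return 'reindex';
        case 'onFinderAfterDelete':
            return 'remove';
        case 'onFinderChangeState':
            return 'update state';
        default:
            return 'ignore';
    }
}

// The context string matches the one set in index(); the list itself is an assumption.
$handled = ['com_jshopping.product'];

echo resolveAction('onFinderAfterSave', 'com_jshopping.product', $handled), PHP_EOL;
echo resolveAction('onFinderAfterDelete', 'com_content.article', $handled), PHP_EOL;
```

Keeping the context check at the very top of each handler means the plugin does no work at all for the many save and delete events that belong to other components.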

getItem() method

This method gets the indexed element by its id. It is called when reindexing after saving articles, products, etc. - on the onFinderAfterSave event. Internally, it receives an SQL query from the getListQuery() method, adds the id of the requested entity to it, and executes the query. However, in the parent class, the id field with the prefix a is hardcoded for tables - $query->where('a.id = ' . (int) $id). Since in our case both the prefix and the name of the field for the request are different, we redefine the method too.

<?php
// I have indicated here only the namespaces used in the example
use Joomla\Utilities\ArrayHelper;
use Joomla\Component\Finder\Administrator\Indexer\Result;

/**
 * Method to get a content item to index.
 *
 * @param   integer  $id  The id of the content item.
 *
 * @return  Result  A Result object.
 *
 * @throws  \Exception on database error.
 * @since   2.5
 */
protected function getItem($id)
{
	// Get the list query and add the extra WHERE clause.
	$query = $this->getListQuery();
	$query->where('prod.product_id = ' . (int) $id);

	// Get the item to index.
	$this->db->setQuery($query);
	$item = $this->db->loadAssoc();

	// Convert the item to a result object.
	$item = ArrayHelper::toObject((array) $item, Result::class);

	// Set the item type.
	$item->type_id = $this->type_id;

	// Set the item layout.
	$item->layout = $this->layout;

	return $item;
}

Conclusion

This article does not pretend to be a complete description of the mechanics of smart search in Joomla, even after 4 months of work on it. And in this case it is not a metaphor. But I hope this article will help those who will take up writing plugins for indexing data from Joomla components or third-party systems.

I will gratefully accept suggestions for improving the article and additions in the comments.

Some articles published on the Joomla Community Magazine represent the personal opinion or experience of the Author on the specific topic and might not be aligned to the official position of the Joomla Project

Copyright

© Sergey Tolkachyov
