Forcing Craft CMS 3's Redactor field to automatically display img alt text

1/13/2019

Recently, I've been working on adding alt text to all of the images on this website. Not only does this make the site more accessible, it's also good for SEO.

The site is built on Craft CMS 3, which uses Twig templates. In templates where I control the output, adding alt text is trivially simple:

<img 
    ...
    alt="{{ image.altText }}"
    ... 
>

Unfortunately, things are a little more complicated when using Craft's default WYSIWYG editor, Redactor. For basic usage, Craft's Redactor implementation works fine and is fully integrated into Craft's asset management system (meaning that if I upload an image using the Redactor WYSIWYG editor, that image is automatically added to Craft's internal asset system).

When I save an entry and look up its contents in the CMS database, I can see that Craft stores the image using a custom reference tag notation:

<img src="{ asset:55:url }" data-image="55">

This decouples the stored data from details that can be changed through Craft's asset manager, such as the image name or filesystem location. Later, when I output an entry's content into a Twig template ...

{{ entry.body }}

... Craft's Redactor Field class parses the reference tag notation via the _parseRefs method, which ultimately calls the Craft kernel's Elements::parseRefs method.

I was really hoping to hook into the parseRefs method but quickly realized that it's a glorified preg_replace that matches on the reference tag pattern and replaces matches inline. This isn't the greatest approach, especially since it's context-blind and happily replaces matches enclosed within <pre><code></code></pre> blocks (if you look closely, I added spaces inside the opening and closing {} braces in the code block above to avoid the issue).
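To make the context-blindness concrete, here is a minimal sketch in plain PHP. It is not Craft's actual code: the pattern is a simplified guess at the reference tag format, and replaceRefTags and the $assetUrls lookup table are hypothetical stand-ins for Craft's asset lookup.

```php
<?php
// Hypothetical asset ID => URL table standing in for Craft's asset records.
$assetUrls = [55 => 'https://example.com/img/uploads/search1.png'];

// Simplified stand-in for a reference tag replacer: one regex pass over the
// whole string, with no awareness of the HTML surrounding each match.
function replaceRefTags(string $html, array $assetUrls): string
{
    return preg_replace_callback(
        '/\{asset:(\d+):url\}/',
        function (array $m) use ($assetUrls): string {
            // Leave the tag untouched when the asset ID is unknown.
            return $assetUrls[(int) $m[1]] ?? $m[0];
        },
        $html
    );
}

$body = '<img src="{asset:55:url}" data-image="55">'
      . '<pre><code>{asset:55:url}</code></pre>';

// Both occurrences are rewritten, including the one inside <pre><code>.
echo replaceRefTags($body, $assetUrls), PHP_EOL;
```

Because the replacement never looks at where a match sits in the markup, the tag inside the code sample is rewritten just like the one in the src attribute.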

The final output from parseRefs looks something like:

<img src="https://www.ismailzai.com/img/uploads/search1.png" data-image="55">

Sadly, there's no clean or built-in way to hook into or alter this behaviour, so I needed a workaround.

I wanted to avoid anything cumbersome, like creating a new field type, and anything brittle, like hacking the kernel. That really left only one option: build a Twig extension to ingest the full HTML output of {{ entry.body }} and create a DOM model based on that. Then I could identify any img elements and look up the assets referenced in their data-image attributes. If an asset had associated alt text, I could add an alt attribute to the DOM model and export the resulting HTML.
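As a rough, Craft-free sketch of that idea, the snippet below uses PHP's built-in DOMDocument rather than any third-party parser, purely so it runs on its own. The addAltFromDataImage function and the $altTexts array are hypothetical; the array stands in for a real asset lookup.

```php
<?php
// Hypothetical asset ID => alt text map standing in for Craft asset queries.
$altTexts = [55 => 'Search results page'];

function addAltFromDataImage(string $html, array $altTexts): string
{
    $doc = new DOMDocument();

    // Wrap the fragment in a <div> so its children can be exported afterwards.
    // ASCII-only input is assumed here; non-ASCII content needs an encoding hint.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        // data-image holds the asset ID; images without it cast to 0 and are skipped.
        $id = (int) $img->getAttribute('data-image');
        if (!empty($altTexts[$id])) {
            $img->setAttribute('alt', $altTexts[$id]);
        }
    }

    // Export only the wrapper's children, dropping the <html>/<body> that
    // loadHTML adds around the fragment.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo addAltFromDataImage('<p>Intro</p><img src="/img/search1.png" data-image="55">', $altTexts), PHP_EOL;
```

The shape is the same whichever parser is used: load the rendered HTML, find each img, map its data-image value to an asset, and write the alt attribute back before serializing.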

I used the PHP HTML Parser library to handle the DOM modelling, which made the rest of the implementation fairly trivial. First, I required the library in my Craft module's base path...

composer require paquettg/php-html-parser:dev-dev

... then in the Twig extension file I created, I just had to require the autoload.php file. The rest was simple:

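public function addImgAltText( $html )
{
    // $html is the Redactor field value; getParsedContent() returns its HTML
    // with Craft's reference tags already resolved.
    $html = $html->getParsedContent();
    $dom = new \PHPHtmlParser\Dom;

    $dom->setOptions([
      'strict' => false,
      'whitespaceTextNode' => true,
      'enforceEncoding' => null,
      'cleanupInput' => true,
      'removeScripts' => false,
      'removeStyles' => false,
      'preserveLineBreaks' => true,
      'removeDoubleSpace' => false,
    ]);

    $dom->load( $html );

    // find() returns a collection object, so iterate by value, not by reference.
    foreach ($dom->find( 'img' ) as $img) {
      // Redactor stores the Craft asset ID in the data-image attribute.
      $imgAssetNumber = $img->getAttribute('data-image');

      // Skip images that are not Craft assets; an empty ID would otherwise
      // leave the asset query unfiltered and match an arbitrary asset.
      if ( !$imgAssetNumber ) {
        continue;
      }

      $imgAsset = \craft\elements\Asset::find()->id( $imgAssetNumber )->one();

      if ( $imgAsset !== null ) {
        $imgAssetFields = $imgAsset->fieldValues;

        if ( !empty($imgAssetFields['altText']) ) {
          // Escape the text so quotes in it cannot break the attribute.
          $img->setAttribute('alt', htmlspecialchars($imgAssetFields['altText'], ENT_QUOTES));
        }
      }
    }

    // Return the markup wrapped as raw HTML so Twig outputs it unescaped.
    return \craft\helpers\Template::raw($dom->outerHtml);
}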

Now I just had to update my Twig template to run entry.body through addImgAltText:

{{ addImgAltText(entry.body) }}

That's it -- so long as a given image has alt text defined in its custom altText field, that text will be automatically fetched and displayed in the Redactor field's output.