WireTextTools::markupToText() method

Convert HTML markup to readable text

Like PHP’s strip_tags, but with some small improvements in HTML-to-text conversion that improve the readability of the resulting text.

As of 3.0.197, the inner content of script, style and object tags is removed along with the tags themselves, rather than just the tags. To revert this behavior or to remove the content of additional tags, see the clearTags option.
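The difference between removing only the tags and removing the tags together with their content can be illustrated in plain PHP. This is a minimal sketch, not ProcessWire's implementation: the helper name `clearThenStrip()` and its regular expression are hypothetical and only approximate the idea behind the clearTags option.

```php
<?php
// Plain PHP sketch; does not call ProcessWire and is not its implementation.
$html = '<p>Hello <strong>world</strong></p>'
    . '<script>alert("x");</script>'
    . '<style>p { color: red; }</style>';

// strip_tags() removes the tags but leaves the script/style content in place.
$stripped = strip_tags($html);

/**
 * Hypothetical helper: remove the listed elements together with their
 * content, then strip whatever tags remain.
 */
function clearThenStrip(string $html, array $clearTags): string {
    foreach ($clearTags as $tag) {
        $name = preg_quote($tag, '!');
        // Non-greedy match from the opening tag to its closing tag.
        $pattern = '!<' . $name . '\b[^>]*>.*?</' . $name . '>!is';
        $html = preg_replace($pattern, '', $html);
    }
    return strip_tags($html);
}

$cleared = clearThenStrip($html, ['script', 'style', 'object']);

echo $stripped, "\n"; // script and style text is still present
echo $cleared, "\n";  // only the readable text remains
```

A regular expression like this does not handle nested or malformed markup, which is one reason to prefer the library method over a hand-rolled version.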

This method is newer than Sanitizer::markupToText(), more powerful, and has more options. The two methods also perform the markup-to-text conversion differently, so you may want to try both to determine which one better suits your needs.

Usage

// basic usage
$string = $wireTextTools->markupToText(string $str);

// usage with all arguments
$string = $wireTextTools->markupToText(string $str, array $options = []);

Arguments

Name | Type(s) | Description
str | string | String to convert to text
options (optional) | array | Any of the following options:
  • keepTags (array): Tag names to keep in returned value, e.g. [ "em", "strong" ]. (default=none)
  • clearTags (array): Tags that should also have their content cleared. (default=[ "script", "style", "object" ]) Since 3.0.197
  • splitBlocks (string): String to split paragraph and header elements. (default="\n\n")
  • convertEntities (bool): Convert HTML entities to plain text equivalents? (default=true)
  • listItemPrefix (string): Prefix for converted list item <li> elements. (default='• ')
  • linksToUrls (bool): Convert links to (url) rather than removing? (default=true) Since 3.0.132
  • linksToMarkdown (bool): Convert links to [text](url) rather than removing? (default=false) Since 3.0.197
  • uppercaseHeadlines (bool): Convert headline tags to uppercase? (default=false) Since 3.0.132
  • underlineHeadlines (bool): Underline headlines with "=" or "-"? (default=true) Since 3.0.132
  • collapseSpaces (bool): Collapse redundant spaces to a single space? (default=true) Since 3.0.132
  • replacements (array): Associative array of strings to manually replace. (default=['&nbsp;' => ' '])
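The options above are passed as an associative array in the second argument. The following is a hedged sketch: the helper name `emailBodyToText()` and the chosen option values are illustrative assumptions, and the function expects to be given a WireTextTools instance (for example, the `$wireTextTools` object used in the usage section, or, as an assumption about the surrounding API, the object returned by `$sanitizer->getTextTools()`).

```php
<?php
/**
 * Hypothetical helper: convert an HTML email body to plain text.
 *
 * $textTools is assumed to be a WireTextTools instance. Option names come
 * from the list above; the values are only examples.
 */
function emailBodyToText(object $textTools, string $html): string {
    $options = [
        // Defaults repeated, plus iframe as an illustrative addition.
        'clearTags' => ['script', 'style', 'object', 'iframe'],
        // Ask for [text](url) style links.
        'linksToMarkdown' => true,
        // Uppercase headlines instead of underlining them.
        'uppercaseHeadlines' => true,
        'underlineHeadlines' => false,
        // Use a plain ASCII prefix for list items.
        'listItemPrefix' => '- ',
    ];
    return $textTools->markupToText($html, $options);
}
```

Inside a ProcessWire template file, this could be called as `emailBodyToText($wireTextTools, $page->body)`, assuming `$wireTextTools` has been obtained beforehand and the page has a `body` field.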

Return value

string

See Also


WireTextTools methods and properties

API reference based on ProcessWire core version 3.0.244
