
Guide 59 - Convert HTML to Word Document

5/4/2023 8:23:54 AM
PHPRunner Tips and Tricks
fhumanes author

We often have text stored as HTML and need to convert it into documents, in this case Word documents.
I have created several solutions that produce invoices, reports, etc. in Word, but always starting from "plain" text, not rich text such as HTML.
Objective
Capture text in HTML format and create a Word document from it, carrying the HTML content over into the Word document.
DEMO: https://fhumanes.com/html_word/
If you are interested in this topic, keep reading the article at this link.

fhumanes author 5/4/2023

Technical Solution
As in my recent articles, I have used PHPWord (https://github.com/PHPOffice/PHPWord) to compose Word documents starting from templates. This case required some investigation, because I did not find a solution on the internet for converting the HTML into the set of elements of a Word document and then, once the elements are obtained, pasting them into the template document.
Because the number of elements produced by the conversion depends on the HTML content, I have defined blocks in the template for these fields and "cloned" each block as many times as there are conversion elements.
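The block cloning described above can be sketched as a small helper. This is only a sketch under assumptions: the function name `fillClonedBlock` and its parameters are hypothetical and not part of the original script. It assumes `$templateProcessor` is a PHPWord `TemplateProcessor` (or any object with the same `cloneBlock`, `setComplexBlock` and `deleteBlock` methods), and that the template contains a block such as `BEXPOSE` wrapping a macro such as `expose`.

```php
<?php
/**
 * Hypothetical helper (not in the original script): clone a template block
 * once per converted element and place each element into its indexed macro.
 *
 * $templateProcessor is assumed to behave like PhpOffice\PhpWord\TemplateProcessor.
 * Returns the number of elements inserted.
 */
function fillClonedBlock($templateProcessor, array $elements, string $blockName, string $macroName): int
{
    $count = count($elements);

    if ($count === 0) {
        // Assumption: an empty HTML field should simply remove the block.
        $templateProcessor->deleteBlock($blockName);
        return 0;
    }

    // The fourth argument (true) asks for indexed macros:
    // name#1, name#2, ... one per clone.
    $templateProcessor->cloneBlock($blockName, $count, true, true);

    foreach (array_values($elements) as $index => $element) {
        // Macros are numbered from 1, array positions from 0.
        $templateProcessor->setComplexBlock($macroName . '#' . ($index + 1), $element);
    }

    return $count;
}
```

With such a helper, the "expose" and "request" fields could each be handled by one call, for example passing `$section->getElements()`, `'BEXPOSE'` and `'expose'`.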
We have to keep in mind that not all HTML is valid for this type of conversion. Some of the items that are not valid are:

  • URL links.
  • Images, videos or objects external to the HTML file.
  • H1, ... headings. As there is no style sheet, there is no good transformation for them either.
  • Others still to be discovered.
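One way to cope with the items listed above is to simplify the HTML before handing it to the converter. The following is a minimal sketch, not the author's method: the function name `simplifyHtmlForWord` and the list of allowed tags are assumptions chosen for illustration, and the regular expressions are a rough filter rather than a full HTML parser.

```php
<?php
/**
 * Hypothetical pre-filter (not in the original script): reduce HTML to a
 * small set of simple tags before converting it to Word elements.
 */
function simplifyHtmlForWord(string $html): string
{
    // Turn headings into bold paragraphs, since no style sheet is available.
    $html = preg_replace('#<h[1-6][^>]*>(.*?)</h[1-6]>#is', '<p><strong>$1</strong></p>', $html);

    // Drop embedded objects together with their content.
    $html = preg_replace('#<(script|style|iframe|object|video)\b[^>]*>.*?</\1>#is', '', $html);

    // Remove every remaining tag that is not in the allowed list.
    // Links lose their <a> wrapper but keep their text; images disappear.
    $allowed = '<p><br><strong><b><em><i><u><ul><ol><li><table><tr><td><th>';

    return strip_tags($html, $allowed);
}
```

The allowed list is deliberately conservative; it can be widened as more tags prove to convert well.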
All the code is in the file "PeticionWord.php":
    `<?php
    @ini_set("display_errors","1");
    @ini_set("display_startup_errors","1");

require_once("../include/dbcommon.php");

// Load the PHPWord library classes
require_once __DIR__ . '/phpword_1.0.0/autoload.php';
use PhpOffice\PhpWord\Element\TextRun;
use PhpOffice\PhpWord\Element\Section;

// ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

$id = (int) $_SESSION['id']; // Petition ID; cast to int because it is interpolated into the SQL query

$decimal = new \NumberFormatter("es-ES", \NumberFormatter::DECIMAL);
$decimal->setAttribute(\NumberFormatter::MIN_FRACTION_DIGITS, 2);
$decimal->setAttribute(\NumberFormatter::MAX_FRACTION_DIGITS, 2); // by default some locales got max 2 fraction digits
$entero = new \NumberFormatter("es-ES", \NumberFormatter::DECIMAL);
$entero->setAttribute(\NumberFormatter::MIN_FRACTION_DIGITS, 0);
$entero->setAttribute(\NumberFormatter::MAX_FRACTION_DIGITS, 0);

// Template processor instance creation
$template_word = __DIR__.'/TemplatePetition.docx';
$templateProcessor = new \PhpOffice\PhpWord\TemplateProcessor($template_word);

// -------------------- ^ header required for Word templates ------------------
$sql="SELECT id, name, date, expose, request FROM html_word WHERE id = $id";
$resql=DB::Query($sql);
$data=$resql->fetchAssoc();

$templateProcessor->setValue('name', $data['name']);

// Full date in the local (Spanish) format
$myDate = DateTime::createFromFormat('Y-m-d', $data['date']);
$formatter = new IntlDateFormatter('es_ES', IntlDateFormatter::LONG, IntlDateFormatter::LONG);
$formatter->setPattern("d 'de' MMMM 'de' yyyy");
$myDate = $formatter->format($myDate);
$templateProcessor->setValue('date', $myDate);

// ------------------------------HTML "expose" -------------------------------------
$phpWord = new \PhpOffice\PhpWord\PhpWord();
$section = $phpWord->addSection();
\PhpOffice\PhpWord\Shared\Html::addHtml($section, $data['expose'], false, false);
$elements_ar = $section->getElements();
$count = count($elements_ar); // Number of elements generated from the HTML
$templateProcessor->cloneBlock('BEXPOSE',$count, true, true);

for ($i = 1; $i <= $count; $i++) {
    $tag = 'expose#'.$i;
    $templateProcessor->setComplexBlock($tag, $elements_ar[$i-1]);
}
// ------------------------------HTML "request" -------------------------------------
$section2 = $phpWord->addSection();
\PhpOffice\PhpWord\Shared\Html::addHtml($section2, $data['request'], false, false);
$elements_ar = $section2->getElements();
$count = count($elements_ar); // Number of elements generated from the HTML
$templateProcessor->cloneBlock('BREQUEST',$count, true, true);

for ($i = 1; $i <= $count; $i++) {
    $tag = 'request#'.$i;
    $templateProcessor->setComplexBlock($tag, $elements_ar[$i-1]);
}

$temp_file = tempnam(sys_get_temp_dir(), 'Word');
$templateProcessor->saveAs($temp_file);

// ------------------ Operation with file result -------------------------------------------
$document = file_get_contents($temp_file);
unlink($temp_file); // delete the temporary file
header('Content-Disposition: attachment; filename="petition.docx"');
header('Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document');
echo $document;`
For any questions or anything else you need, contact me by email: fernandohumanes@gmail.com
As always, I am leaving you the sources so you can try them on your PCs.
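For anyone adapting the download step at the end of the script, the response headers can be built in one place. This is a hedged sketch: the helper name `docxDownloadHeaders` is hypothetical, and the filename clean-up rule is an assumption chosen to keep the quoted filename safe, not something the original code does.

```php
<?php
/**
 * Hypothetical helper (not in the original script): build the HTTP headers
 * for sending a .docx file as a download.
 */
function docxDownloadHeaders(string $filename, int $length): array
{
    // Keep only characters that are safe inside a quoted filename.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));

    return [
        // Registered media type for Word .docx documents.
        'Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Content-Length: ' . $length,
    ];
}

// Usage sketch, assuming $document holds the generated file contents:
// foreach (docxDownloadHeaders('petition.docx', strlen($document)) as $line) {
//     header($line);
// }
// echo $document;
```

Returning the headers as an array, instead of calling `header()` directly, keeps the helper easy to check without a web server.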