Building an HTML Form Importer for WordPress

Git: https://bitbucket.org/MrFlannagan/better-html-form-importer

I was recently working on a project at work where I had to take an old, terribly-coded, ASP.NET-processed HTML form and move it over to WordPress.  This thing had a lot of inputs, selects and such.  I did a quick search for a tool that might help simplify the process.  For one thing, it would save some real time, and for another, I HATE doing monotonous tasks.

Well, I had no luck finding anything to help me, so I thought, “What if I could build it?”  I love toying with building scrapers and parsers.  I knew this would be sloppy, as every crappy HTML form has completely different code, different structures, different mark-up and, more often than not, broken HTML.

I sketched a few concepts and settled on something that would work with <div> layouts, <table> layouts and whatever else.  Would it work perfectly?  I knew right away that it wouldn’t, but if I could get 80%, hell, even 60% right on most forms, it would save some real time.  And if I built some tools to help simplify fixing the errors, well hell, we’re saving some serious time, baby!

So what I settled on was this:

  • Recursively loop through all nodes in a block of HTML using DOMDocument.
  • Save the current element if it’s just text/characters.
    This part gave me some issues.  I tried using preg_replace to remove all white space and detect whether the element was a word, but it just wasn’t catching them all.  So I decided to split the whole value into individual characters.  If the loop encounters an alphanumeric character, it breaks and saves that node’s text as the current element.  Basically this just bypasses any strange white space characters.

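    // Only text nodes (nodeType 3) can hold a label.
    if ( XML_TEXT_NODE === $node->nodeType ) {
        $nValue   = $node->nodeValue;
        $allchars = str_split( $nValue );
        foreach ( $allchars as $achar ) {
            // One letter or digit is enough to treat this node as real text.
            if ( ctype_alnum( $achar ) ) {
                $prevText = $nValue;
                break;
            }
        }
    }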
  • If the element is a form input of some sort, create a label input with the value of the previous textual element (saved as the current element in the previous step), create an element name input, and finally create a duplicate of the current element with its value in a table row and display it.
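
Put together, the three steps above could look something like the sketch below. This is a minimal illustration, not the code from the repo: the function name bhfi_collect_fields, the array keys, the sample HTML, and the choice to skip the contents of a <select> are all assumptions made for the example.

```php
<?php
// Hypothetical helper (not from the plugin): walk a DOM subtree and pair
// each form control with the most recent piece of readable text before it.
function bhfi_collect_fields( DOMNode $node, &$prevText = '', array &$fields = array() ) {
    foreach ( $node->childNodes as $child ) {
        if ( XML_TEXT_NODE === $child->nodeType ) {
            // Keep the text only if it contains at least one letter or digit.
            foreach ( str_split( $child->nodeValue ) as $achar ) {
                if ( ctype_alnum( $achar ) ) {
                    $prevText = trim( $child->nodeValue );
                    break;
                }
            }
        } elseif ( XML_ELEMENT_NODE === $child->nodeType ) {
            $tag = strtolower( $child->nodeName );
            if ( in_array( $tag, array( 'input', 'select', 'textarea' ), true ) ) {
                $fields[] = array(
                    'label' => $prevText,
                    'name'  => $child->getAttribute( 'name' ),
                    'value' => $child->getAttribute( 'value' ),
                    'tag'   => $tag,
                );
                // Assumption: skip the control's children so <option> text
                // is not mistaken for the next field's label.
                continue;
            }
            // Any other element: keep recursing.
            bhfi_collect_fields( $child, $prevText, $fields );
        }
    }
    return $fields;
}

// Sample markup mixing a <div> layout and a <table> layout.
$html = '<form><div>First name: <input type="text" name="fname" value="Ann"></div>'
      . '<table><tr><td>State</td><td><select name="state"><option>OH</option></select></td></tr></table></form>';

libxml_use_internal_errors( true ); // Old forms are often broken HTML; keep the parser quiet.
$doc = new DOMDocument();
$doc->loadHTML( $html );

$fields = bhfi_collect_fields( $doc->documentElement );
print_r( $fields );
```

The remembered text is passed by reference so it survives the jump from one branch of the tree to the next, which is what lets a label in one <td> reach a control in the neighbouring <td>.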

This way, every form element is displayed with a best-guess label taken from the text before it, and that label can be edited easily.  There is also an editable field for the element’s name and one for its value.
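
As a rough idea of what one of those editable rows could look like in code, here is a small sketch. It is a simplification under stated assumptions: the function name bhfi_render_row, the field array shape and the fields[...] naming scheme are made up for the example, and the value cell is a plain text input rather than a true duplicate of the original element. It uses plain PHP htmlspecialchars for escaping; inside a WordPress plugin, esc_attr would be the natural swap.

```php
<?php
// Hypothetical renderer: one table row per detected field, with editable
// label, name and value inputs. The array keys are assumed, not the plugin's.
function bhfi_render_row( array $field, $index ) {
    $cells = '';
    foreach ( array( 'label', 'name', 'value' ) as $key ) {
        $cells .= sprintf(
            '<td><input type="text" name="fields[%d][%s]" value="%s"></td>',
            $index,
            $key,
            // Escape quotes so a scraped label cannot break the attribute.
            htmlspecialchars( $field[ $key ], ENT_QUOTES, 'UTF-8' )
        );
    }
    return '<tr>' . $cells . '</tr>';
}

$row = bhfi_render_row(
    array( 'label' => 'First "name":', 'name' => 'fname', 'value' => 'Ann' ),
    0
);
echo '<table>' . $row . '</table>';
```

Indexing the inputs as fields[0][label], fields[0][name] and so on means PHP hands the edited rows back as one nested array when the form is submitted.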

So far, I’m really happy with the results.  (https://bitbucket.org/MrFlannagan/better-html-form-importer)

My plan is to finish all the different types of inputs (checkboxes, file uploads, etc.) and then add a way to drag and sort them, create additional elements and more.  I’ll give users the option to create a shortcode to display the form and to set whether it emails the results or stores them in the database.  Finally, I’m hoping to make use of the Gravity Forms API so forms can be imported there, along with any other form plugins I can figure out.