Implementing ADAM XML content in a Drupal installation, part 1: proof of concept

ADAM is one of the largest online consumer health encyclopedias, and many health care institutions use it as a patient-centric source of information on diseases, injuries, and health care procedures. This tutorial walks you through the steps of importing ADAM XML content into a Drupal website installation.

Important considerations:
All XML and XPath Parser settings are case-sensitive. Be careful with your typing.
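
To make the case-sensitivity point concrete, here is a minimal Python sketch. The XML snippet is a hypothetical, heavily trimmed stand-in for an ADAM article (its element and attribute names are assumptions borrowed from the queries used later in this tutorial), and Python's ElementTree is used only to illustrate XPath behavior, not how Feeds evaluates queries internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ADAM-style article (assumed structure, not real data).
SAMPLE = """
<adamContent title="Burns">
  <textContent title="Do Not">Do not apply ice to a severe burn.</textContent>
</adamContent>
"""

root = ET.fromstring(SAMPLE)

# Attribute values are compared character for character.
exact = root.find("textContent[@title='Do Not']")
wrong_case = root.find("textContent[@title='DO NOT']")

# Element names are case-sensitive as well.
wrong_element = root.find("textcontent[@title='Do Not']")

print("exact match found:", exact is not None)
print("upper-case title found:", wrong_case is not None)
print("lower-case element found:", wrong_element is not None)
```

Only the first query finds anything. A field labeled ‘DO NOT’ in Drupal can still map to a section titled ‘Do Not’ in the XML, as long as the query uses the XML's spelling.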

Installing Modules

First, make sure you have the proper modules installed. In order to import ADAM content into a Drupal instance, you’re going to need the Feeds module. Feeds and its add-on modules depend on a few other modules, so install the following:

  1. Install the Chaos Tools Suite (http://drupal.org/project/ctools)
  2. Install Job Scheduler (http://drupal.org/project/job_scheduler) [currently in alpha, but it seems to work just fine]
  3. Install the Feeds Module (http://drupal.org/project/feeds)
  4. Install the Feeds XPath Parser Module (http://drupal.org/project/feeds_xpathparser) [This module will enable Drupal to parse the XML data into one of the content types you’ll create in the next section]
  5. Install the Feeds Directory Fetcher Module (http://drupal.org/project/feedsfetcherdirectory) [This module will enable the Feed import to point to a whole directory, as opposed to a single XML or HTML file]

Create A Content Type

Create a new content type, ct Injury Article (or similar)

Create fields that match the XML structure of your ADAM article type. We’re going to start with an import of injury articles, which share the following structure:

  • Title
  • Definition
  • Alternative Names
  • Considerations
  • Causes
  • Symptoms
  • First Aid
  • DO NOT
  • Call immediately for emergency assistance if
  • Prevention
  • References
  • Review Date
  • Reviewed By
  • Department/Service Line (part of our taxonomy)
  • Specialty
  • Sub-specialty (might be the same as title)

Note: Make sure your machine-readable names make sense. Also, make sure that you select ‘Filtered text (user selects text format)’ for each field’s ‘Text Processing’ choice.

Create A Second Content Type

Create another content type that is mostly generic. The importer will be attached to this content type, and the Feeds module will use it to parse the XML data into the ‘Health Encyclopedia’ content type (in our case, ‘ct Injury Article’).

  • ct Injury XML Importer (There shouldn’t be any reason to add fields beyond the default ‘title’ field. This content type exists only to transfer content from the Feeds Importer to the ‘Health Encyclopedia’ content type.)

Create a New Importer

  • Navigate to Structure->Feeds importers
  • Click Add importer
  • Give it a good name: Health Encyclopedia Content Importer
  • Give it a good description: An importer that will pull and parse ADAM XML content into the Health Encyclopedia content type

Create New Settings

Configure each of the following settings sections:

  • Basic Settings: Attach the importer to the XML importer content type (in our case, ‘ct Injury XML Importer’). This content type performs a simple gateway task: it takes the imported content and moves it directly into the actual content type, ‘Health Encyclopedia’ (our ‘ct Injury Article’).
  • Fetcher: Choose ‘Feeds Fetcher Directory’. We’re going to move our ADAM XML files into a local directory, which we’ll then use to create the dynamic content used by the Drupal implementation. You can also import from a URL (HTTP Fetcher), which we’ll cover in an appendix.
    • Settings for ‘Feeds Fetcher Directory’: Change the default File Mask

      /.txt$/

      to allow for the parsing of XML documents:

      /.xml$/

      Select both ‘Re-fetch files that are modified’ and ‘Recursively scan directory’ for now. This can be changed at a later date if you find custom content is being overwritten.

  • Parser: Choose XPATH XML Parser
    • XPATH XML Parser Settings: There shouldn’t be any settings to choose from at this point. We’ll come back and enter those using the XPATH syntax at a later time.
    • Note: Leave ‘Allow source configuration override’ checked.
  • Processor: Choose ‘Node Processor’, as we’re creating new content (nodes).
    • Node Processor Settings: For now, you don’t have to change anything from the default values except ‘Content Type’. You’ll want that to be the actual content type you want to end up modifying and adding to. In our case, ‘ct Injury Article’.
    • Node Processor Mapping: Here, you want to create a direct relationship between individual XML items in the source files (the content supplied by ADAM) and the various fields in the Health Encyclopedia content type you made. From the ‘Source’ dropdown, select ‘XPATH Expression’, and from the ‘Target’ dropdown, select the field you want to map to each XPATH expression. (Note: the importer will automatically give each ‘XPATH Expression’ selection a sequential ID number. As you click ‘Add’ for each mapped pair, you should see the ‘Source’ value update to xpathparser:0, xpathparser:1, xpathparser:2, and onwards to xpathparser:n. These will be associated with specific items from the XML source files back in the Parser Settings.)
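
A side note on the File Mask from the Fetcher settings above: the mask is a regular expression, and an unescaped dot matches any character, not just a literal period. The sketch below uses Python's `re` module to show the difference. It assumes the mask behaves like an ordinary Perl-compatible pattern with the surrounding slashes as delimiters, and the file names are made up for illustration.

```python
import re

# The mask from the settings above, minus the "/" delimiters, next to a
# variant with the dot escaped. Both are shown for comparison only.
mask_as_written = re.compile(r".xml$")   # "." matches any single character
mask_escaped = re.compile(r"\.xml$")     # "\." matches a literal dot only

# Hypothetical file names.
filenames = ["000021.xml", "notes.txt", "sitemapxml", "000021.xml.bak"]

matched_as_written = [f for f in filenames if mask_as_written.search(f)]
matched_escaped = [f for f in filenames if mask_escaped.search(f)]

print(matched_as_written)
print(matched_escaped)
```

With the mask as written, a file named `sitemapxml` would also be picked up. If your source directory holds anything besides ADAM articles, escaping the dot (`/\.xml$/`) is the stricter choice.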

Set Parser Settings

Now go back and select the ‘XPATH XML Parser Settings’. You’ll see there are a host of new fields here, each corresponding to one of the fields from the ‘ct Injury Article’ content type. You’re going to give each of these an XPATH query that points to one of the ADAM XML elements. In other words, each field will be associated with its corresponding section within the ADAM articles.

Context: Here you need to include the source files’ initial content definition. For our purposes, it’s adamContent, which in the files’ XML structure acts as the base for all other content section fields. Please check your own ADAM import to make sure the base structure hasn’t been changed.

  • Syntax: //adamContent
    Note: the double slash tells XPath to match adamContent elements anywhere in the document, at any depth. It says nothing about where the files live; that comes from the directory you give the importer. For our purposes, we’ll be using a directory of public://ADAMsource. You can set the location of your public directory by navigating to Configuration->Media and entering a directory in the ‘Public file system path’ field. In our deployment, the directory is sites/default/files.
    So, taken together, the importer works as follows:
    for each XML file under sites/default/files/ADAMsource/, look for ‘adamContent’ and begin parsing content from there
  • Title: @title
  • Definition: textContent[@title='Definition']
  • Alternative Names: textContent[@title='Alternative Names']
  • Considerations: textContent[@title='Considerations']
  • Causes: textContent[@title='Causes']
  • Symptoms: textContent[@title='Symptoms']
  • First Aid: textContent[@title='First Aid']
  • DO NOT: textContent[@title='Do Not']
  • Call: textContent[@title='Call immediately for emergency medical assistance if']
  • Prevention: textContent[@title='Prevention']
  • References: textContent[@title='References']
  • Review Date: versionInfo[@reviewdate]
  • Reviewed By: versionInfo[@reviewedby]
  • Department/Service Line (part of our taxonomy)
  • Specialty (part of our taxonomy)
  • Sub-specialty: @title
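
To see how the context and the per-field queries fit together, here is a rough Python sketch that walks a sample document the same way. Everything in it is an assumption made for illustration: the XML is a hypothetical, trimmed stand-in for an ADAM article, the `evaluate` helper is made up, and ElementTree supports only a subset of XPath, so this shows the idea and not the Feeds XPath Parser's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ADAM-style article. Element and attribute names
# follow the queries above; real files may differ.
SAMPLE = """
<adamContent title="Burns">
  <textContent title="Definition">A burn is damage to the skin.</textContent>
  <textContent title="Causes">Heat, chemicals, or electricity.</textContent>
  <versionInfo reviewdate="1/1/2011" reviewedby="Example Reviewer, MD"/>
</adamContent>
"""

# Source IDs as they appear on the mapping screen, paired with their queries.
QUERIES = {
    "xpathparser:0": "@title",
    "xpathparser:1": "textContent[@title='Definition']",
    "xpathparser:2": "textContent[@title='Causes']",
}


def evaluate(context, query):
    """Evaluate one relative query against a context element (hypothetical helper).

    ElementTree cannot select attributes through a path, so a bare
    attribute query such as "@title" is handled by hand.
    """
    if query.startswith("@"):
        return context.get(query[1:])
    element = context.find(query)
    return None if element is None else (element.text or "").strip()


root = ET.fromstring(SAMPLE)

# iter() visits the root and every descendant, which mirrors how the
# "//adamContent" context matches that element at any depth.
for article in root.iter("adamContent"):
    fields = {source: evaluate(article, query) for source, query in QUERIES.items()}

    # "versionInfo[@reviewdate]" selects the versionInfo element that has a
    # reviewdate attribute; the value itself is read from the attribute.
    version = article.find("versionInfo[@reviewdate]")
    review_date = version.get("reviewdate")

    print(fields)
    print(review_date)
```

One thing worth knowing about XPath itself: `versionInfo[@reviewdate]` selects the versionInfo element that carries a reviewdate attribute, not the attribute's value. If the Review Date or Reviewed By fields come through empty in your import, a query of the form `versionInfo/@reviewdate` may be what you need. That is an observation about XPath in general and has not been tested against real ADAM files here.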

Prep Your ADAM Content for Import

Note: this section assumes you have already downloaded your ADAM content using the TransADAM application or other means and have culled a selection of articles to perform a test import. For our purposes, we created an automated script that pulls 30 injury articles from the default ADAM directory.

  • Navigate to Configuration and select ‘File system’ within the ‘Media’ selections.
  • Enter a directory that will house the ADAM source files. These files will be used to generate the content that sits within the Drupal database. Once the content is imported, the files can be deleted from the server’s directory structure.
  • Create a folder within your specified directory and move your culled ADAM articles to that directory.
    • Example: our directory structure is sites/default/files, and our ADAM folder is ADAMsource. Our final location for the source ADAM files will be sites/default/files/ADAMsource/
    • Note: in future parts, we’ll examine how to import directly from a remote server or directory outside the Drupal installation, but for now, this test from a local directory should prove how easy it is to establish a massive health-related content base within a Drupal instance.
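
The culling script mentioned at the top of this section isn't reproduced here, but a minimal Python sketch of the same idea might look like the following. The function name, the directory names in the demo, and the choice to take the first 30 files in sorted order are all assumptions. Note that it does not try to tell injury articles apart from other article types; it assumes the source directory already contains only the articles you want.

```python
import shutil
import tempfile
from pathlib import Path


def cull_articles(source_dir, dest_dir, limit=30):
    """Copy up to `limit` XML files from source_dir into dest_dir.

    Returns the names of the files that were copied.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    # Sort so that repeated runs pick the same files.
    for path in sorted(source.glob("*.xml"))[:limit]:
        shutil.copy2(path, dest / path.name)
        copied.append(path.name)
    return copied


if __name__ == "__main__":
    # Demo against throwaway directories. For real use, point source_dir at
    # your ADAM download and dest_dir at sites/default/files/ADAMsource.
    with tempfile.TemporaryDirectory() as tmp:
        download = Path(tmp) / "adam_download"
        download.mkdir()
        for i in range(40):
            (download / f"{i:06d}.xml").write_text("<adamContent/>")

        target = Path(tmp) / "sites" / "default" / "files" / "ADAMsource"
        copied = cull_articles(download, target)
        print(f"Copied {len(copied)} files to {target}")
```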

Navigate to /import (it should be available on your home screen)

  • Select the ‘ct Injury XML Importer’
  • Give your importer a title
    *Your title should again be descriptive. ‘Initial ADAM content import’, for example
  • Enter the local directory that houses your ADAM source files. In our example instance, the files are located in the ADAMsource directory within sites/default/files. Because we configured that location as our default Public Directory location, we enter the following into the Directory field: public://ADAMsource. Change yours as necessary.
  • XPATH Parse Settings: these should carry over from your Feeds settings. Expand the menu, check the settings, and make sure they’re correct. You shouldn’t have to touch anything, but double-check just to make sure.
  • Hit ‘Save’

Depending on the number of new nodes created, you could generate an extremely lengthy and complex notification. Should your import cause any errors, they will show up in red.

Navigate to your front page. Congratulations! You have successfully completed a proof-of-concept import for massive, detailed health encyclopedia content.


Photo of caduceus by takomabibelot, flickr.