XML files

XML files combine Unicode text with XML markup, providing an ideal format for customized displays and machine processing of the text. Tools for processing XML files are widely available without cost. Working with the XML files is less error-prone than attempting to parse text files. XML files include a descriptive TEI header and may include Documentary Hypothesis (DH) markings.

XML files can be obtained in one of four ways:

  1. When viewing biblical text, press the "XML" button at the bottom of a text page. The file will be displayed by the browser and can be saved via the browser's "Save Page as ..." command. The output always includes DH markings.
  2. XML files can also be obtained from the Server.xml server by entering a query URL into a browser address bar. No parameters other than the text citation are permitted. For example,
    https://tanach.us/Server.xml?Deut26:5-9
    returns Deuteronomy 26:5-9. The Server.xml output always includes DH markings. See the Servers page for more information about using the Server.xml server.
  3. XML files for entire biblical books can be obtained by clicking on the book name on the Home page. The middle row of the resulting page offers a table of available formats. Click on the "XML" item to view the entire book in XML format. Files obtained this way do not include DH markings. Files with and without DH markings are stored in the Books subdirectory as described below.
  4. A zipped archive of all XML files is available from Tanach.xml.zip as described below.
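
The query URL described in item 2 can also be assembled and fetched from a script. The following is a minimal Python sketch that assumes only the URL pattern shown above (the server address, a "?", and the citation). The function names server_xml_url and fetch_xml are hypothetical and are not part of the site.

```python
from urllib.request import urlopen

# Base address taken from the example URL in item 2.
BASE_URL = "https://tanach.us/Server.xml"


def server_xml_url(citation: str) -> str:
    """Return the Server.xml query URL for a citation such as 'Deut26:5-9'."""
    citation = citation.strip()
    if not citation:
        raise ValueError("citation must not be empty")
    # The citation is the only parameter the server permits,
    # so nothing else is appended to the query.
    return f"{BASE_URL}?{citation}"


def fetch_xml(citation: str, timeout: float = 30.0) -> bytes:
    """Download the XML for a citation (hypothetical helper; needs network access)."""
    with urlopen(server_xml_url(citation), timeout=timeout) as response:
        return response.read()
```

Calling server_xml_url("Deut26:5-9") yields the example URL given in item 2; fetch_xml returns the raw bytes, which can be saved to a file or passed to an XML parser.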

How the markup of an XML file is displayed depends on the browser and its settings. If the display is a markup-free jumble of letters, choose the browser's "View page source" menu item to see the markup.

XML file directories:

The XML files for each Tanach book are contained in the Books subdirectory. It contains the 39 unmarked book files, Genesis.xml ... Chronicles 2.xml, and 5 DH-marked book files, Genesis.DH.xml ... Deuteronomy.DH.xml. Examination of a typical file, Habakkuk.xml, provides insight into the file structure.

The Books directory also has a header file, TanachHeader.xml, containing the overall TEI header, transcription notes, and a character coding table, and an index file, TanachIndex.xml, containing the number of chapters and verses in each book. The brevity of TanachIndex.xml makes it quick to determine whether a citation is valid; the same information is also available, book by book, from the longer book files.
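
The validity check that the index file enables can be sketched as follows. This Python example assumes the chapter and verse counts have already been loaded into a mapping from book name to a list of verse counts, one per chapter. That mapping, the function name is_valid_citation, and the sample entry are illustrative assumptions; the element layout of TanachIndex.xml is not described here, so loading the file is left out.

```python
def is_valid_citation(index: dict, book: str, chapter: int, verse: int) -> bool:
    """Return True if book/chapter/verse exists according to the index.

    `index` maps a book name to a list of verse counts, one entry per chapter
    (a hypothetical in-memory form of the data held in TanachIndex.xml).
    """
    chapters = index.get(book)
    if chapters is None:
        return False  # unknown book
    if not 1 <= chapter <= len(chapters):
        return False  # chapter number out of range
    # Chapters are numbered from 1, list positions from 0.
    return 1 <= verse <= chapters[chapter - 1]


# Illustrative entry only: Habakkuk has 3 chapters of 17, 20, and 19 verses.
sample_index = {"Habakkuk": [17, 20, 19]}
```

With this sample, is_valid_citation(sample_index, "Habakkuk", 3, 19) is accepted, while chapter 4 or verse 3:20 is rejected.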

Finally, the Books subdirectory contains Tanach.xml.zip, a zipped archive of all files in the Books subdirectory. It has 46 book files as well as license and publisher information. The length and SHA-256 hash of the Tanach.xml.zip file are logged to allow validation. An edited log of recent releases of Tanach.xml.zip is also available.
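
A downloaded archive can be checked against the logged length and SHA-256 hash with a few lines of Python. This is a minimal sketch using only the standard library; the function name verify_archive is hypothetical, and the expected values must be copied from the published log by the user.

```python
import hashlib
from pathlib import Path


def verify_archive(path, expected_length: int, expected_sha256: str) -> bool:
    """Return True if the file's size and SHA-256 digest match the expected values."""
    path = Path(path)
    # Cheap check first: compare the file length in bytes.
    if path.stat().st_size != expected_length:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    # Hex digests are compared case-insensitively.
    return digest.hexdigest() == expected_sha256.strip().lower()
```

For example, verify_archive("Tanach.xml.zip", length, sha256) returns True only when both logged values match the local file.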

Assistance available:

The publisher will provide assistance in working with XML files.
