XML files combine Unicode and XML markup, providing an ideal format for customized displays
and machine processing of the text. Tools for processing XML files are widely available without cost.
Working with the XML files is less error-prone than attempting to parse
text files. XML files include a descriptive TEI header
and may include Documentary Hypothesis (DH) markings.
XML files can be obtained in one of four ways:
- When viewing biblical text, press the "XML" button
at the bottom of a text page. The file will be displayed by the browser and can be saved
via the browser's "Save Page as ..." command. The output always includes DH markings.
- XML files can also be obtained from the Server.xml server by entering a query URL into a
browser address bar. No parameters other than the text citation are permitted. For example,
https://tanach.us/Server.xml?Deut26:5-9
returns Deuteronomy 26:5-9. The Server.xml output always includes DH markings.
See the Servers page for more information about using the Server.xml server.
- XML files for entire biblical books can be obtained by clicking on the book name on the Home
page. The middle row of the resulting page offers a table of available formats.
Click on the "XML" item to view the entire book in XML format.
Files obtained this way do not include DH markings. Files with and without DH markings are
stored in the Books subdirectory as described below.
- A zipped archive of all XML files is available from Tanach.xml.zip as described below.
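The Server.xml query from the list above can be sketched in Python. This is a minimal illustration rather than part of the site's own tooling: the names build_query_url and fetch_xml are hypothetical, and the checks that reject whitespace and "&" are only an assumed reading of the rule that nothing but the citation may follow the "?".

```python
from urllib.request import urlopen

# Base address taken from the example URL in the text.
BASE_URL = "https://tanach.us/Server.xml"


def build_query_url(citation: str) -> str:
    """Return the Server.xml query URL for a citation such as 'Deut26:5-9'."""
    citation = citation.strip()
    # Assumption: a citation has no whitespace and no '&', since no
    # parameters other than the citation are permitted.
    if not citation or "&" in citation or any(ch.isspace() for ch in citation):
        raise ValueError(f"not a bare citation: {citation!r}")
    return f"{BASE_URL}?{citation}"


def fetch_xml(citation: str, timeout: float = 30.0) -> bytes:
    """Download the XML for a citation (hypothetical helper; needs network access)."""
    with urlopen(build_query_url(citation), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only builds the URL; call fetch_xml("Deut26:5-9") to actually download.
    print(build_query_url("Deut26:5-9"))
```

Keeping URL construction separate from the download makes it possible to check the query string without contacting the server.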
How the markup of an XML file is displayed depends on the browser and its settings.
If the display is a markup-free jumble of letters, use the browser's "View page source"
menu item to see the markup.
XML file directories:
The XML files for each Tanach book are contained in the Books subdirectory. It contains
the 39 unmarked book files, Genesis.xml ... Chronicles 2.xml,
and 5 DH-marked book files, Genesis.DH.xml ... Deuteronomy.DH.xml.
Examination of a typical file, Habakkuk.xml, provides insight into the file structure.
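One way to get a first look at the structure of a downloaded book file is to count how often each element name occurs. The sketch below uses only Python's standard library and assumes nothing about the actual tag names; the function name tally_tags is hypothetical, and the inline sample is invented for the demonstration and does not reflect the real file layout.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def tally_tags(source) -> Counter:
    """Count element tags in an XML file (path or binary file-like object)."""
    counts = Counter()
    # iterparse streams the document, so large book files are not loaded
    # into memory as a single tree before counting begins.
    for _event, element in ET.iterparse(source, events=("start",)):
        counts[element.tag] += 1
    return counts


if __name__ == "__main__":
    # Invented sample, NOT the structure of the real book files.
    sample = io.BytesIO(b"<doc><item/><item/><note>text</note></doc>")
    for tag, count in tally_tags(sample).most_common():
        print(tag, count)
```

For a saved copy of a book file, the same function could be called with the file's path, for example tally_tags("Habakkuk.xml").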
The Books directory also has a header file, TanachHeader.xml,
containing the overall TEI header, transcription notes, and a character coding table,
and an index file, TanachIndex.xml,
containing the number of chapters and verses in each book.
Because the TanachIndex.xml file is short, it allows a quick check
of whether a citation is valid; the same information
can also be obtained book by book from the longer book files.
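The citation check that the index makes possible can be sketched as follows. The sketch assumes the chapter and verse counts have already been read into a plain dictionary; it does not parse TanachIndex.xml, whose layout is not described here. The names SAMPLE_INDEX and is_valid_citation, the book name "ExampleBook", and its counts are all made up for illustration.

```python
from typing import Dict, List

# Hypothetical index: book name -> number of verses in each chapter.
# Toy values only; they are NOT taken from TanachIndex.xml.
SAMPLE_INDEX: Dict[str, List[int]] = {
    "ExampleBook": [3, 5],  # chapter 1 has 3 verses, chapter 2 has 5
}


def is_valid_citation(index: Dict[str, List[int]],
                      book: str, chapter: int, verse: int) -> bool:
    """Return True if book, chapter, and verse all exist in the index."""
    verses_per_chapter = index.get(book)
    if verses_per_chapter is None:
        return False  # unknown book
    if not 1 <= chapter <= len(verses_per_chapter):
        return False  # chapter number out of range
    # Chapters are numbered from 1; list positions start at 0.
    return 1 <= verse <= verses_per_chapter[chapter - 1]


if __name__ == "__main__":
    print(is_valid_citation(SAMPLE_INDEX, "ExampleBook", 2, 5))
    print(is_valid_citation(SAMPLE_INDEX, "ExampleBook", 2, 6))
```

Since only the counts are needed, a check like this never has to open the much longer book files.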
Finally, the Books subdirectory contains
Tanach.xml.zip,
a zipped archive of all files in the Books subdirectory.
It has 46 book files as well as License and publisher information.
The length and SHA-256 hash of the Tanach.xml.zip
file are logged to allow validation.
An edited log of recent releases
of Tanach.xml.zip is also available.
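Validating a downloaded archive against the logged length and SHA-256 hash might look like the sketch below. It relies only on Python's standard hashlib module; the function names length_and_sha256 and matches_log are hypothetical, and the expected values must be copied from the log by the user because none are reproduced here.

```python
import hashlib
from pathlib import Path


def length_and_sha256(path, chunk_size: int = 65536):
    """Return (length in bytes, lowercase hex SHA-256 digest) of a file."""
    digest = hashlib.sha256()
    length = 0
    with Path(path).open("rb") as handle:
        # Read in chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            length += len(chunk)
    return length, digest.hexdigest()


def matches_log(path, expected_length: int, expected_sha256: str) -> bool:
    """Compare a file with the length and hash copied from the log."""
    length, sha256_hex = length_and_sha256(path)
    return (length == expected_length
            and sha256_hex == expected_sha256.strip().lower())
```

A call such as matches_log("Tanach.xml.zip", logged_length, logged_hash), with both values taken from the log, returns True only when the length and the hash agree.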
Assistance available:
The publisher will provide assistance in
working with XML files.