UXLC Leningrad Codex

The initial value, וְאָ֣הַבְתָּ֔, appears in Leviticus and Deuteronomy.

Spaces (`0020`) are important!

Search principles

The reference text is a string of Unicode characters from the Unicode/XML Leningrad Codex (UXLC). The user provides a search "target" indicating the desired text. The software finds only exact matches between the reference text and the target text.

Because the reference text always contains Hebrew characters (consonants, vowels, accents) and pseudo accents (ZWJ, CGJ, blank), a direct text-to-text comparison of the reference and target texts is nearly useless. For example, a direct search for בְּרֵאשִׁית would yield no results because the reference text contains accents: בְּרֵאשִׁ֖ית. A search for the consonants בראשית would also return nothing because vowels and accents appear between the consonants in the reference text.

To overcome this limitation, both the reference and target texts are "filtered" down to a specified "Content" before the comparison, as indicated in the table below:

Content	Included characters
consonants	Hebrew consonants and blank, x0020.
vowels	The above plus Hebrew vowels, plus maqaf, shin dot, sin dot, and sof pasuq.
accents	The above plus Hebrew accents, and two pseudo accents: the zero-width joiner (ZWJ), x200d and the combining grapheme joiner (CGJ), x034f.

The Coding page gives a display of these character definitions and their hexadecimal values.
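
The filtering described above can be sketched in Python. This is only an illustration, not the site's code: the function name `filter_content` and the code point ranges below are assumptions based on the standard Unicode Hebrew block, and the Coding page remains the authority on which character belongs to which Content level (the placement of dagesh and meteg here is a guess).

```python
def _chars(*ranges):
    """Expand inclusive (low, high) code point ranges into a set of characters."""
    return {chr(c) for lo, hi in ranges for c in range(lo, hi + 1)}

# ASSUMED character classes; check the Coding page for the real definitions.
CONSONANTS = _chars((0x05D0, 0x05EA), (0x0020, 0x0020))      # alef..tav, blank
VOWELS = CONSONANTS | _chars(
    (0x05B0, 0x05BC),   # vowel points and dagesh (assumed to count as vowels)
    (0x05BE, 0x05BE),   # maqaf
    (0x05C1, 0x05C3),   # shin dot, sin dot, sof pasuq
    (0x05C7, 0x05C7),   # qamats qatan
)
ACCENTS = VOWELS | _chars(
    (0x0591, 0x05AF),   # accents
    (0x05BD, 0x05BD),   # meteg (assumed to count as an accent)
    (0x200D, 0x200D),   # zero-width joiner (ZWJ)
    (0x034F, 0x034F),   # combining grapheme joiner (CGJ)
)
LEVELS = {"consonants": CONSONANTS, "vowels": VOWELS, "accents": ACCENTS}

def filter_content(text, content):
    """Keep only the characters allowed at the given Content level."""
    allowed = LEVELS[content]
    return "".join(ch for ch in text if ch in allowed)

# The first word of Genesis with vowels and one accent (U+0596),
# written as escapes so the mark order is explicit.
reference = "\u05d1\u05b0\u05bc\u05e8\u05b5\u05d0\u05e9\u05c1\u05b4\u0596\u05d9\u05ea"
target = "\u05d1\u05e8\u05d0\u05e9\u05d9\u05ea"   # consonants only

# Unfiltered, the target is not a substring of the reference;
# after filtering both to "consonants", they match exactly.
print(target in reference)
print(filter_content(target, "consonants") == filter_content(reference, "consonants"))
```

Filtering both sides with the same function is what makes the exact comparison meaningful: a target pasted with full accents still matches at the "consonants" level because its extra marks are dropped too.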

Most problems arise from an unintentional mismatch between the Content of the target and the "Content" setting.

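The target text must contain at least as much content as the Content setting calls for. For example, a search with "vowels" Content needs a target that already includes its vowels; a consonants-only target would not match. Although the software provides some warning messages, the user must always be careful to set the target text and Content correctly.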

To see the name of a pulldown list, let the cursor hover over the list for a few seconds.

In the UXLC, orthographic words are contained within XML tags (w, k, q) in each verse. For each verse, the contents of these tags are concatenated, usually with intermediate blanks, to form a single text to be searched: the verse text. No blank is added after an orthographic word ending in a maqaf, however. The user chooses which variants (k, q, or both) are included in the verse text. A blank character is added at the start of each verse. The comparison does not cross verse boundaries.
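
A minimal sketch of how such a verse text might be assembled, assuming each verse is available as a list of (tag, text) pairs. The function name `build_verse_text`, the pair representation, the handling of "both" (k and q kept in document order), and the absence of a trailing blank are all assumptions made for illustration; the site's actual behavior may differ.

```python
MAQAF = "\u05be"

def build_verse_text(words, variants="both"):
    """Join a verse's orthographic words into one searchable string.

    words    -- list of (tag, text) pairs, tag being "w", "k" or "q"
    variants -- "k", "q" or "both": which variant readings to include
    """
    keep = {"w"} | ({"k", "q"} if variants == "both" else {variants})
    selected = [text for tag, text in words if tag in keep]

    verse = " "  # a blank is added at the start of each verse
    for i, word in enumerate(selected):
        verse += word
        is_last = i == len(selected) - 1
        # Intermediate blank, except after a word ending in a maqaf.
        if not is_last and not word.endswith(MAQAF):
            verse += " "
    return verse

# Hypothetical verse: two ordinary words (the second ends in a maqaf)
# followed by a ketiv/qere pair.
words = [
    ("w", "\u05d0\u05ea"),
    ("w", "\u05db\u05dc\u05be"),
    ("k", "\u05d4\u05d5"),
    ("q", "\u05d4\u05d9"),
]
print(repr(build_verse_text(words, variants="q")))
```

Because each verse yields its own string, a target can never match across a verse boundary when verses are searched one at a time.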

Blanks are significant parts of both the verse and target texts. To facilitate awareness of leading and trailing blanks, '>' and '<' characters are printed on either side of the target text in the search results.

With the default setting, "verses and counts", the entire verse is displayed for each match. When the "Display mode" pulldown list is set to "counts only", only the count is given.
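
The matching and display behavior can be pictured with a short Python sketch. It assumes the verse texts have already been built and filtered, counts non-overlapping occurrences with `str.count`, and uses a made-up output wording; the function name `search`, the dictionary layout, and the way the site actually counts instances are assumptions.

```python
def search(verses, target, display_mode="verses and counts"):
    """Count exact occurrences of target, one verse at a time.

    verses -- dict mapping a verse reference to its (filtered) verse text
    Returns the total count and the lines a results page might show.
    """
    total = 0
    lines = []
    for ref, text in verses.items():
        n = text.count(target)       # exact substring match within one verse
        if n:
            total += n
            if display_mode == "verses and counts":
                lines.append(f"{ref}: {text}")
    # '>' and '<' make leading and trailing blanks in the target visible.
    lines.append(f"{total} instances of >{target}< found")
    return total, lines

# Placeholder verse texts; each starts with the leading blank.
verses = {
    "verse 1": " \u05dc\u05d0\u05d4 \u05d5\u05e8\u05d7\u05dc",
    "verse 2": " \u05e8\u05d7\u05dc",
}
total, lines = search(verses, " \u05dc\u05d0\u05d4 ", display_mode="counts only")
print(total)
```

Note that the target here carries a blank on each side, so it only matches where the word is delimited by blanks in the verse text.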

Specifying the target text

Two methods of entering the target text are available, set by the "Input mode" pulldown list.

Entering Unicode characters

Copying and pasting text into the input field with the "Unicode characters" input method is probably the easiest approach. Always copy text having at least the content that will be needed in the search. Pasting text with "accents" Content is recommended in all cases. Then set the Content pulldown to the desired target Content.

Entering hexadecimal values

The hexadecimal constant input method is helpful for situations in which the desired text isn't available to copy or a Hebrew keyboard isn't available. Enter the constants left-to-right as blank-separated hexadecimal numbers, e.g. 20 5dc 5d0 5d4 20 for > לאה <.
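
As an illustration of what the hexadecimal input amounts to, the following Python sketch converts a blank-separated list of constants into the corresponding string. The function name `hex_to_text` is hypothetical; the site's own parsing and validation may differ.

```python
def hex_to_text(constants):
    """Turn blank-separated hexadecimal code points into a Unicode string."""
    return "".join(chr(int(token, 16)) for token in constants.split())

# The example from the text: blank, lamed, alef, he, blank.
target = hex_to_text("20 5dc 5d0 5d4 20")
print(repr(target))
print([f"{ord(ch):04x}" for ch in target])
```

The constants are listed in logical (reading) order, which is why they are typed left-to-right even though the resulting Hebrew text displays right-to-left.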

For either input method:

This search is not recommended when a consonant in the target carries more than one vowel or more than one accent. The order in which the vowels or accents are entered after a consonant, the "mark ordering", is critical to a successful search: the same marks entered in a different order will not match.
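
A small Python sketch shows why mark ordering matters under exact matching. The two strings below hold the same consonant and the same two marks and typically render identically, yet they are different character sequences. The helper name `code_points` is hypothetical; it is simply a way to inspect the order of a target before searching.

```python
def code_points(text):
    """List the characters of text as four-digit hexadecimal code points."""
    return [f"{ord(ch):04x}" for ch in text]

# Bet with sheva and dagesh, entered in two different orders.
sheva_first = "\u05d1\u05b0\u05bc"
dagesh_first = "\u05d1\u05bc\u05b0"

print(code_points(sheva_first))
print(code_points(dagesh_first))
print(sheva_first == dagesh_first)   # exact comparison sees two different strings
```

Listing the code points of a pasted target in this way makes it possible to compare its mark order with the order used in the reference text.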

Examples

Practice with a single book before trying the lengthier multi-book searches of the Torah or Tanach.

To start, go to Genesis 29:31 and, with the page content set to "Accents", copy the 5th orthographic word לֵאָ֔ה, Leah, to the clipboard. Return to the search page and set the "Book" pulldown to Genesis and the "Input mode" pulldown to "Unicode characters". Paste the text into the input field. Set the Content pulldown to "consonants" and the Variants pulldown to "both". Click "Search".

A new page titled "Search of Genesis for …" should appear within a few seconds. It should indicate that 36 instances of >לאה< have been found.

To find verses referring to Nebuchadnezzar, try entering the five letters נבכדנ using the hex input method: 5e0 5d1 5db 5d3 5e0. Set Book to Tanach and Content to "consonants". The result shows 13 instances in 4 books.

The shalshelet accent ס֓ occurs infrequently. Enter 0593 in the hex field, set Book to Tanach, Content to "accents", and click "Search". The 46 verses containing the shalshelet are shown.


