Transliterations
From Open Siddur Project Development Wiki
Transliterations are a special type of parallel text. In that sense, a manual transliteration can be handled in the same way as a translation.
Manual transliterations, however, are discouraged, for the following reasons:
- There is more than one way to transliterate a work, and we want to be able to support as many transliteration styles as possible.
- Any edition of the siddur should have consistent transliterations throughout the text, instructions, and commentaries.
- We have a computer program that transliterates Hebrew text automatically.
When short blocks of transliteration are required in instructions or commentaries, the automated transliterator can be called using the j:segGen tag. The @type attribute is required on j:segGen; for transliterations, it has the value "transliteration". This tag is intended to be an analogue to tei:divGen for inline texts.
For example:
<j:segGen type="transliteration">בְּמִצְווֹתָיו</j:segGen>
indicates that an in-place transliteration of the Hebrew word is desired.
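As a rough illustration of how a processor might locate these requests, the Python sketch below collects the text of every j:segGen element whose @type is "transliteration". It is not part of the Open Siddur code: the function name find_transliteration_requests is hypothetical, and the j: namespace URI is the one used in the sample file later on this page.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the j: prefix in this page's sample file.
J_NS = "http://jewishliturgy.org/ns/jlptei/1.0"


def find_transliteration_requests(xml_text):
    """Return the text of every j:segGen whose @type is "transliteration".

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return [
        (element.text or "").strip()
        for element in root.iter(f"{{{J_NS}}}segGen")
        if element.get("type") == "transliteration"
    ]


sample = (
    '<p xmlns:j="http://jewishliturgy.org/ns/jlptei/1.0">'
    'Say <j:segGen type="transliteration">בְּמִצְווֹתָיו</j:segGen> aloud.'
    "</p>"
)
print(find_transliteration_requests(sample))
```

Elements with any other @type value are skipped, since the page only defines the "transliteration" value.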
A processor should display some text related to the j:segGen, even if transliteration is turned off. Each processor may define its own fallback. Examples of fallbacks include:
- Displaying the untransliterated Hebrew text.
- Choosing a default transliteration table.
- Flagging an error to the user (discouraged, but conformant).
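The three fallbacks listed above could be modelled as a small policy switch. The following Python sketch is only one possible arrangement: the names Fallback and render_seg_gen are hypothetical, and the transliterate argument stands in for whatever transliteration routine a processor actually uses.

```python
from enum import Enum


class Fallback(Enum):
    """Fallback choices mirroring the list above (names are hypothetical)."""
    SHOW_HEBREW = "show-hebrew"
    DEFAULT_TABLE = "default-table"
    ERROR = "error"


def render_seg_gen(hebrew, transliterate, table=None,
                   fallback=Fallback.SHOW_HEBREW, default_table=None):
    """Return the text to display for one j:segGen.

    `table` is None when transliteration is turned off.
    `transliterate` is any callable taking (text, table_name).
    """
    if table is not None:
        return transliterate(hebrew, table)
    if fallback is Fallback.SHOW_HEBREW:
        return hebrew  # display the untransliterated Hebrew text
    if fallback is Fallback.DEFAULT_TABLE and default_table is not None:
        return transliterate(hebrew, default_table)
    # Discouraged but conformant: flag an error to the user.
    raise ValueError("transliteration requested but no table is active")
```

Whichever policy is chosen, the function either returns some text for the j:segGen or raises an error, so the element is never silently dropped.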
The formatting of the transliteration is handled entirely by the stylesheet.
In fully transliterated texts, the texts may be aligned at the discretion of the processor or stylesheet. When a text is both translated and transliterated, the transliteration may conveniently be aligned at the same points as the translation.
The Automated Transliterator
For detailed technical information about the automated transliterator, see the XSLT 2.0 source code. The remainder of this file concerns the implementation details of the automated transliterator. The use of this particular implementation is not required by the JLPTEI guidelines.
The automated transliterator defines its own microformat, in the XML namespace (http://jewishliturgy.org/ns/tr/1.0). A transliteration table is a pluggable component that is used to define the correspondence between a Hebrew alphabet form and a form in another alphabet.
An example table (SBL transliteration style) is reproduced below:
<tr:table xmlns:tr="http://jewishliturgy.org/ns/tr/1.0">
  <tr:tr from="&aleph;" to="&righthalfring;"/>
  <tr:tr from="&bet;" to="b&underline;"/>
  <tr:tr from="&bet;&dageshormapiq;" to="b"/>
  <tr:tr from="&gimel;" to="g&underline;"/>
  <tr:tr from="&gimel;&dageshormapiq;" to="g"/>
  <tr:tr from="&dalet;" to="d&underline;"/>
  <tr:tr from="&dalet;&dageshormapiq;" to="d"/>
  <tr:tr from="&he;" to="h"/>
  <tr:tr from="&he;&dageshormapiq;" to="h"/>
  <tr:tr from="&vav;" to="w"/>
  <tr:tr from="&zayin;" to="z"/>
  <tr:tr from="&het;" to="h&dotbelow;"/>
  <tr:tr from="&tet;" to="t&dotbelow;"/>
  <tr:tr from="&yod;" to="y"/>
  <tr:tr from="&finalkaf;" to="k&underline;"/>
  <tr:tr from="&kaf;" to="k&underline;"/>
  <tr:tr from="&finalkaf;&dageshormapiq;" to="k"/>
  <tr:tr from="&kaf;&dageshormapiq;" to="k"/>
  <tr:tr from="&lamed;" to="l"/>
  <tr:tr from="&finalmem;" to="m"/>
  <tr:tr from="&mem;" to="m"/>
  <tr:tr from="&finalnun;" to="n"/>
  <tr:tr from="&nun;" to="n"/>
  <tr:tr from="&samekh;" to="s"/>
  <tr:tr from="&ayin;" to="&lefthalfring;"/>
  <tr:tr from="&pe;" to="p&underline;"/>
  <tr:tr from="&finalpe;" to="p&underline;"/>
  <tr:tr from="&pe;&dageshormapiq;" to="p"/>
  <tr:tr from="&finalpe;&dageshormapiq;" to="p"/>
  <tr:tr from="&finaltsadi;" to="s&dotbelow;"/>
  <tr:tr from="&tsadi;" to="s&dotbelow;"/>
  <tr:tr from="&qof;" to="q"/>
  <tr:tr from="&resh;" to="r"/>
  <tr:tr from="&shin;" to=""/>
  <tr:tr from="&shin;&shindot;" to="s&caron;"/>
  <tr:tr from="&shin;&sindot;" to="s&acute;"/>
  <tr:tr from="&tav;" to="t&underline;"/>
  <tr:tr from="&tav;&dageshormapiq;" to="t"/>
  <tr:tr from="&shevana;" to="&schwa;"/>
  <tr:tr from="&shevanach;" to=""/>
  <tr:tr from="&hatafsegol;" to="e&breve;"/>
  <tr:tr from="&hatafpatah;" to="a&breve;"/>
  <tr:tr from="&hatafqamats;" to="o&breve;"/>
  <tr:tr from="&hiriq;" to="i"/>
  <tr:tr from="&hiriq;&yod;" to="i&circumflex;"/>
  <tr:tr from="&tsere;" to="e&macron;"/>
  <tr:tr from="&tsere;&yod;" to="e&circumflex;"/>
  <tr:tr from="&segol;" to="e"/>
  <tr:tr from="&segol;&yod;" to="e&circumflex;"/>
  <tr:tr from="&patah;" to="a"/>
  <tr:tr from="&qamats;" to="a&macron;"/>
  <tr:tr from="&qamats;&he;" to="a&circumflex;"/>
  <tr:tr from="&holam;" to="o&macron;"/>
  <tr:tr from="&vav;&holam;" to="o&circumflex;"/>
  <tr:tr from="&holamhaserforvav;" to="o&macron;"/>
  <tr:tr from="&qubuts;" to="u"/>
  <tr:tr from="&vav;&dageshormapiq;" to="u&circumflex;"/>
  <tr:tr from="&qamatsqatan;" to="o"/>
  <tr:tr from="&qamats;&yod;&vav;" to="a&macron;yw"/>
  <tr:tr from="&cgj;" to=""/>
  <tr:tr from="&maqaf;" to="–"/>
</tr:table>
The tr:table element contains one or more tr:tr elements. Each of those has a single @from attribute and a single @to attribute. Some composite characters (e.g., &hiriq;&yod;) are treated differently from the sequence of characters &hiriq; followed by &yod;. All of those cases are shown in the example above.
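One way to picture how a composite entry can take precedence over its parts is greedy longest-match lookup. The Python sketch below rests on that assumption; it is not a description of the XSLT implementation, which may resolve composites differently. The function name transliterate and the four-entry table are illustrative only, with the Hebrew characters written as Unicode escapes rather than the entities from hebrew.ent.

```python
HIRIQ = "\u05B4"  # Hebrew point hiriq
YOD = "\u05D9"    # Hebrew letter yod
MEM = "\u05DE"    # Hebrew letter mem

# Illustrative subset of a table: @from -> @to.
table = {
    MEM: "m",
    YOD: "y",
    HIRIQ: "i",
    HIRIQ + YOD: "i\u0302",  # composite entry: i + combining circumflex
}


def transliterate(text, table):
    """Replace the longest matching @from sequence at each position.

    Assumption: longest match wins. Characters with no entry pass through.
    """
    longest = max(map(len, table), default=0)
    output = []
    position = 0
    while position < len(text):
        for size in range(min(longest, len(text) - position), 0, -1):
            chunk = text[position:position + size]
            if chunk in table:
                output.append(table[chunk])
                position += size
                break
        else:
            output.append(text[position])
            position += 1
    return "".join(output)


print(transliterate(MEM + HIRIQ + YOD, table))
print(transliterate(YOD + HIRIQ, table))
```

In the first call the hiriq and yod are consumed together by the composite entry; in the second, where the yod precedes the hiriq, each character is looked up on its own.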
Note that a dagesh may change the transliteration of any letter. The one exception is vav: in the transliteration table, vav followed by dagesh indicates a shuruq. The transliterator doubles a letter that carries dagesh hazak, but @from does not otherwise distinguish between dagesh hazak and dagesh kal. Virtual doubling can be disabled by adding a @double="no" attribute to the transliteration entry.
A shin with no shin-dot or sin-dot in @from represents the "silent sin", seen in "Issachar."
By default, the transliterator ignores silent vowel letters (eg, aleph, he). Some transliteration styles require these letters to be displayed in transliteration. In those cases, use an attribute @silent in addition to @to, to indicate how the silent letter should be transliterated.
The XML entities corresponding to Hebrew letters are in the source file hebrew.ent. Their use is not necessary, but helps for the display of some characters.
The transliterator also supports some options. These options are placed in tr:option elements, each of which has an @name and @value (both of which are case-sensitive). Supported options and values are shown below:
| Name | Values (default in bold) | Function |
|---|---|---|
| replace-tetragrammaton | on, off | Replace occurrences of the Tetragrammaton with liturgical pronunciations |
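To show how option names and values might be read, here is a minimal Python sketch. It assumes that tr:option elements sit inside tr:table, which this page does not actually state, and the helper name read_options is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the transliteration table microformat.
TR_NS = "http://jewishliturgy.org/ns/tr/1.0"


def read_options(table_xml):
    """Return {name: value} for every tr:option found in the document.

    Names and values are compared exactly, since both are case-sensitive.
    """
    root = ET.fromstring(table_xml)
    return {
        option.get("name"): option.get("value")
        for option in root.iter(f"{{{TR_NS}}}option")
    }


# Assumed placement: tr:option as a child of tr:table.
sample_table = (
    '<tr:table xmlns:tr="http://jewishliturgy.org/ns/tr/1.0">'
    '<tr:option name="replace-tetragrammaton" value="off"/>'
    "</tr:table>"
)
options = read_options(sample_table)
print(options.get("replace-tetragrammaton"))
```

Because the lookup is an exact dictionary match, a name such as "Replace-Tetragrammaton" would not be recognized.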
Testing Transliterations
For testing convenience, the standard "make" run in trunk builds a test file called /text/strongs/strongs-xlit.xml. This file contains all 8,674 unique Hebrew/Aramaic words found in the Tanach. It can be used to test transliteration styles and to check the correctness of the transliteration tables. To run a test, change the following line (line 54 of strongs-xlit.xml) so that it specifies the transliteration table to use:
<tei:symbol value="strongs"/>
Valid transliteration table values are:
- strongs = Strong's Concordance style
- sbl = Society of Biblical Literature style
- mih = Modern Israeli Hebrew style
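For anyone who switches tables often, the edit can be scripted. The Python sketch below is a convenience under stated assumptions, not part of the project's tooling: set_table is a hypothetical helper, and it rewrites the first tei:symbol element it finds, so it should be checked against the real file if that file contains more than one.

```python
import re

# Table names listed above.
VALID_TABLES = {"strongs", "sbl", "mih"}

# Matches <tei:symbol value="..."/> and captures the text around the value.
SYMBOL_PATTERN = re.compile(r'(<tei:symbol\s+value=")[^"]*("\s*/>)')


def set_table(xml_text, table):
    """Return xml_text with the first tei:symbol value set to `table`."""
    if table not in VALID_TABLES:
        raise ValueError(f"unknown transliteration table: {table!r}")
    new_text, count = SYMBOL_PATTERN.subn(
        rf"\g<1>{table}\g<2>", xml_text, count=1
    )
    if count == 0:
        raise ValueError("no tei:symbol element found")
    return new_text


line = '<tei:symbol value="strongs"/>'
print(set_table(line, "sbl"))
```

A caller would read strongs-xlit.xml as text, pass it through set_table, and write the result back before running the transform.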
Once the transliteration style has been set in strongs-xlit.xml, run the following shell command (in the following example, the command is run from the trunk directory):
lib/transform.sh text/strongs/strongs-xlit.xml strongs-xlit.html
The resulting strongs-xlit.html file can then be browsed to verify the correctness of the transliterations. The basic format of each entry is:
H1 אָב 'ab ’âb
which is:
a b c d
where:
- a = Strong's number
- b = Hebrew word
- c = pronunciation
- d = the generated transliteration
The words are all listed on the first line in alphabetical order with their associated Strong's Concordance number for easy reference and pronunciation (to help those who don't read Hebrew well). On the second line, the generated transliteration is listed. Whenever a new transliteration style is defined, this file should be used to help verify the correctness of the transliterations.
Alternatively, a tester who only wants to check the transliteration of a limited section of text can create a custom transliteration XML file. Note: the same steps for setting the transliteration style and running the transform.sh command apply to custom XML transliteration files.
Here is an example custom transliteration file which outputs a line of Hebrew text followed by the transliteration of that line (using the SBL transliteration style):
<?xml version="1.0" encoding="UTF-8"?>
<!-- This file is intended to test the functionality of transliteration of Hebrew-only texts without a parallel group -->
<tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:j="http://jewishliturgy.org/ns/jlptei/1.0">
  <tei:teiHeader>
  </tei:teiHeader>
  <tei:fsdDecl>
    <tei:fsdLink type="transliteration" target="transliteration-fsd.xml#fsTransliteration" />
  </tei:fsdDecl>
  <j:conditionGrp>
    <tei:fs type="transliteration" xml:id="transliterator_on">
      <tei:f name="table"><tei:symbol value="sbl"/></tei:f>
    </tei:fs>
  </j:conditionGrp>
  <j:links>
    <tei:link type="set" targets="#main #transliterator_on"/>
  </j:links>
  <tei:text>
    <tei:body xml:lang="he">
      <j:concurrent xml:id="concurrency">
        <j:selection xml:id="selection">
          <tei:ptr xml:id="se1" target="#r1"/>
          <tei:ptr xml:id="se2" target="#r2"/>
          <tei:ptr xml:id="se3" target="#r3"/>
          <tei:ptr xml:id="se4" target="#r4"/>
          <tei:ptr xml:id="se5" target="#r7"/>
          <tei:ptr xml:id="se6" target="#r8"/>
          <tei:ptr xml:id="se7" target="#r9"/>
          <tei:ptr xml:id="se8" target="#r10"/>
          <tei:ptr xml:id="se9" target="#r11"/>
          <tei:ptr xml:id="se10" target="#r12"/>
          <tei:ptr xml:id="se11" target="#r13"/>
          <tei:ptr xml:id="se12" target="#r14"/>
          <tei:ptr xml:id="se13" target="#r5"/>
          <tei:ptr xml:id="se14" target="#r6"/>
        </j:selection>
        <j:view type="div">
          <tei:div xml:id="main">
            <tei:head>עֵירוּב תַּבְשִׁילִין</tei:head>
            <tei:ptr target="#range(se1,se14)"/>
          </tei:div>
        </j:view>
        <j:view type="p">
          <tei:p xml:id="p1">
            <tei:ptr target="#s1"/>
          </tei:p>
          <tei:p xml:id="p2">
            <tei:ptr target="#s2"/>
          </tei:p>
        </j:view>
        <j:view type="s">
          <tei:s xml:id="s1">
            <tei:ptr target="#range(se1,se4)"/>
          </tei:s>
          <tei:s xml:id="s2">
            <tei:ptr target="#range(se5,se14)"/>
          </tei:s>
        </j:view>
      </j:concurrent>
      <j:repository xml:lang="he">
        <tei:seg xml:id="r1">
          <tei:w xml:id="r1w1">בָּרוּךְ</tei:w>
          <tei:w xml:id="r1w2">אַתָּה</tei:w>
          <tei:w xml:id="r1w3"> <j:divineName> יְהוָה </j:divineName> </tei:w>
        </tei:seg>
        <tei:seg xml:id="r2">
          <tei:w xml:id="r2w1">אֱלֹהֵינוּ</tei:w>
          <tei:w xml:id="r2w2">מֶלֶךְ</tei:w>
          <tei:w xml:id="r2w3">הָעוֹלָם</tei:w>
        </tei:seg>
        <tei:seg xml:id="r3">
          <tei:w xml:id="r3w1">אֲשֶׁר</tei:w>
          <tei:w xml:id="r3w2">קִדְּשָׁנוּ</tei:w>
          <tei:w xml:id="r3w3">בְּמִצְוֺתָיו</tei:w>
        </tei:seg>
        <tei:seg xml:id="r4">
          <tei:w xml:id="r4w1">וְצִוָּנוּ</tei:w>
          <tei:w xml:id="r4w2">עַל</tei:w>
          <tei:w xml:id="r4w3">מִצְוַת</tei:w>
          <tei:w xml:id="r4w4">עֵרוּב</tei:w>
        </tei:seg>
        <tei:seg xml:id="r5">
          <tei:w xml:id="r5w1">לָנוּ</tei:w>
        </tei:seg>
        <tei:seg xml:id="r6">
          <tei:w xml:id="r6w1">וּלְכׇל</tei:w>
          <tei:pc>־</tei:pc>
          <tei:w xml:id="r6w2">הַדָּרִים</tei:w>
          <tei:w xml:id="r6w3">בָּעִיר</tei:w>
          <tei:w xml:id="r6w4">הַזֹּאת</tei:w>
        </tei:seg>
      </j:repository>
      <j:repository xml:lang="arc">
        <tei:seg xml:id="r7">
          <tei:w xml:id="r7w1">בַּהֲדֵין</tei:w>
          <tei:w xml:id="r7w2">עֵרוּבָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r8">
          <tei:w xml:id="r8w1">יְהֵא</tei:w>
          <tei:w xml:id="r8w2">שָׁרֵא</tei:w>
          <tei:w xml:id="r8w3">לָנָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r9">
          <tei:w xml:id="r9w1">לְמֵיפֵא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r10">
          <tei:w xml:id="r10w1">וּלְבַשָּׁלָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r11">
          <tei:w xml:id="r11w1">וּלְאַטְמָנָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r12">
          <tei:w xml:id="r12w1">וּלְאַדְלָקָא</tei:w>
          <tei:w xml:id="r12w2">שְׁרָגָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r13">
          <tei:w xml:id="r13w1">וּלְמֶעְבַּד</tei:w>
          <tei:w xml:id="r13w2">כָּל</tei:w>
          <tei:pc xml:id="r13pc1">־</tei:pc>
          <tei:w xml:id="r13w3">צָרְכָּנָא</tei:w>
        </tei:seg>
        <tei:seg xml:id="r14">
          <tei:w xml:id="r14w1">מִיּוֹמָא</tei:w>
          <tei:w xml:id="r14w2">טָבָא</tei:w>
          <tei:w xml:id="r14w3">לְשַׁבְּתָא</tei:w>
        </tei:seg>
      </j:repository>
    </tei:body>
  </tei:text>
</tei:TEI>
Here are the first few lines of output from this custom transliteration file:
עֵירוּב תַּבְשִׁילִין
בָּרוּךְ אַתָּה יְהוָה
bārûk̲ ʾattâ ʾăd̲ōnāy
אֱלֹהֵינוּ מֶלֶךְ הָעוֹלָם
ʾĕlōhênû melek̲ hāʿôlām