Architecture
From Open Siddur Project Development Wiki
This page describes the components of the Open Siddur technical architecture. For details on how we're doing at developing these parts and putting them together, see milestones.
All references to directories below are relative to the root directory of the source code.
Summary Description
The Open Siddur Project is developing a socially-networked online publishing platform for rendering print-ready digital files from publicly shared and user-contributed content. Textual content is semantically tagged, internally linked, and annotated. Graphical and other content is stored as annotation to text.
The platform will enable users to:
- share original and adapted content between friends, groups, or globally for creative reuse,
- maintain and share changes in the content of any shared work,
- compare and trace variations in the content of any shared work,
- arbitrarily recombine (i.e., remix) content selected from multiple sources into a new linear work,
- design sharable layout templates adjustable to the desired shape and format of the finished work.
Additionally, the platform will preserve and record a chain of metadata for all shared work, including:
- what source(s) it came from
- who wrote it, edited it, and/or proofread it
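As a rough illustration of the metadata chain described above, the following Python sketch models a work that carries its sources and contributors, and a remix that inherits both. The names `Contribution`, `Work`, and `remix`, and the role strings, are hypothetical and are not taken from the project's schema. This shows the idea only, not how JLPTEI actually records provenance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contribution:
    """Who did what to a work (role names are illustrative only)."""
    person: str
    role: str  # e.g. "author", "editor", "proofreader"


@dataclass
class Work:
    """A shared work together with its chain of metadata (hypothetical model)."""
    title: str
    text: str
    sources: list = field(default_factory=list)        # titles of source works
    contributions: list = field(default_factory=list)  # Contribution records


def remix(title, parts, compiler):
    """Combine several works into one linear work, keeping the metadata chain."""
    sources = []
    contributions = []
    for part in parts:
        # The new work records each part as a source, plus that part's own sources.
        for name in [part.title, *part.sources]:
            if name not in sources:
                sources.append(name)
        # Everyone who worked on a part is credited on the remix as well.
        for contribution in part.contributions:
            if contribution not in contributions:
                contributions.append(contribution)
    contributions.append(Contribution(compiler, "compiler"))
    return Work(title, "\n".join(p.text for p in parts), sources, contributions)
```

The point of the sketch is that a remix never discards provenance: the source list and the contributor list only grow as works are recombined.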
Summary Diagram
The platform architecture is summarized in the following diagram:
Infrastructure
- Wiki
- Serves documentation and acts as a temporary transcription interface
- Virtual Private Server (VPS)
- Serves the database, web applications, and demos
Backend
These parts define the format specifications and form the backbone of the Open Siddur Project's publishing platform.
- XML Database
- An installation of eXist at jewishliturgy.org, which you may mirror locally. Stores all primary data and acts as an interface to client applications by executing XQueries through its REST interface.
- JLPTEI Specifications
- A TEI-based XML schema for representing liturgical texts. Also includes validation software, written in TEI ODD and Schematron. The source code is in the schema/ directory.
- Transforms
- Conversion programs to translate JLPTEI into output formats, such as XHTML or PDF. The transform code is stored in git and served off the database. The source code is in the code/transforms directory.
- REST API
- XQuery APIs that run off the database and can be accessed via the database's REST interface. Provide basic functions for manipulating data in the database and functions that are used frequently by applications. The source code is in the code/api directory.
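To make the role of a transform concrete, here is a minimal Python sketch that turns a small TEI fragment into XHTML elements. It is not the project's transform code, and this page does not say what language the real transforms are written in. The element mapping in `TAG_MAP` and the function name `tei_to_xhtml` are assumptions for illustration, and the input is generic TEI rather than real JLPTEI.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical mapping from TEI element names to XHTML element names.
TAG_MAP = {"div": "div", "head": "h2", "p": "p"}


def tei_to_xhtml(tei_element):
    """Recursively copy a TEI element tree into XHTML elements."""
    local_name = tei_element.tag.split("}")[-1]
    # Unknown elements fall back to <span> so that their text is not lost.
    html = ET.Element(f"{{{XHTML_NS}}}{TAG_MAP.get(local_name, 'span')}")
    html.text = tei_element.text
    html.tail = tei_element.tail
    for child in tei_element:
        html.append(tei_to_xhtml(child))
    return html


if __name__ == "__main__":
    source = f'<div xmlns="{TEI_NS}"><head>Title</head><p>Some <hi>text</hi>.</p></div>'
    result = tei_to_xhtml(ET.fromstring(source))
    print(ET.tostring(result, encoding="unicode"))
```

A real transform would also have to resolve internal links and annotations, which this sketch ignores.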
Client applications
The client application is an integrated web-based application used to manipulate data in the database. As of now, there are a number of demos that act as thin wrappers around the REST API and allow a user to perform some manipulations of database data. The source code is in the code/apps directory.
The end-user client application should be a user-friendly web interface that can display, edit and compile JLPTEI data from the database.
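A thin wrapper of the kind described above might look like the following Python sketch, which sends an XQuery to the database over HTTP. It relies on eXist's REST interface accepting a query through the `_query`, `_howmany`, and `_start` GET parameters. The REST root URL, the collection path, and the function names are placeholders chosen for illustration, not the project's actual endpoints or API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(rest_root, collection, xquery, how_many=10, start=1):
    """Build a GET URL that asks the database to run an XQuery on a collection."""
    params = urlencode({"_query": xquery, "_howmany": how_many, "_start": start})
    return f"{rest_root.rstrip('/')}/{collection.strip('/')}?{params}"


def run_query(rest_root, collection, xquery):
    """Send the query with HTTP GET and return the response body as text."""
    with urlopen(build_query_url(rest_root, collection, xquery)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Assumes a local eXist mirror; the collection name is hypothetical.
    print(build_query_url("http://localhost:8080/exist/rest", "/db/example", "count(//p)"))
```

Keeping URL construction separate from the network call makes the wrapper easy to test without a running database.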
Helper applications
- One-time converters
- If we are given data in a format that is not JLPTEI, but is consistently applied, then it can be converted to JLPTEI directly via one-time-use conversion applications. As we receive more contributed material, the importance of one-time converters will increase. Some existing converters are in the code/input-conversion directory.
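The following Python sketch shows the general shape of a one-time converter. It assumes a made-up plain-text input convention (lines starting with `# ` are headings, blank lines separate paragraphs) and emits generic TEI `div`, `head`, and `p` elements. Both the input convention and the function name `convert_plain_text` are hypothetical. The output is not valid JLPTEI, and the existing converters in the source tree may work quite differently.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def convert_plain_text(source):
    """Convert consistently formatted plain text into a TEI <div> element."""
    div = ET.Element(f"{{{TEI_NS}}}div")
    paragraph_lines = []

    def flush():
        # Emit the lines collected so far as a single paragraph.
        if paragraph_lines:
            p = ET.SubElement(div, f"{{{TEI_NS}}}p")
            p.text = " ".join(paragraph_lines)
            paragraph_lines.clear()

    for raw_line in source.splitlines():
        line = raw_line.strip()
        if line.startswith("# "):
            # Assumed convention: "# " marks a heading.
            flush()
            head = ET.SubElement(div, f"{{{TEI_NS}}}head")
            head.text = line[2:].strip()
        elif line:
            paragraph_lines.append(line)
        else:
            # A blank line ends the current paragraph.
            flush()
    flush()
    return div
```

Because the converter depends entirely on the input being consistent, it is only worth writing when a contributed text follows one convention throughout.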
Content
Texts and annotated graphics form the basis of the Open Siddur Project's content. They are the reason we are writing all the software. Content falls into the following categories:
- Scanned documents
- Books or manuscripts as images. These are stored in a web-viewable compressed JPEG format on our website, and will eventually be served for transcription by the transcription application. They are currently manually added to the wiki for transcription. A list of current and wanted sources is on the Brainstorm session page.
- Contributed texts
- These are original works submitted to us (for now, through the mechanism described in Submissions HOWTO).
- Original text transcriptions
- Transcriptions are currently being collected on the wiki.
- Encoded texts
- These are the forms of text that can be processed by our transforms. As of now, only one-time converter software produces encoded text.
- Translations
- We are not ready yet to accept translations directly into the archive. If you are interested in translating, talk to us. If you already have translated material that you would like us to include, see Submissions HOWTO.
- Instructional material
- We are not ready yet to accept instructional material directly into the archive. If you are interested in writing it, talk to us. If you already have instructional material that you would like us to include, see Submissions HOWTO.
- Comments/essays
- We are not ready yet to accept commentary directly into the archive. If you are interested in writing it, talk to us. If you already have commentary that you would like us to include, see Submissions HOWTO.
Non-technical
In addition to the technical aspects of the project, there are also some non-technical things we do:
- Membership in the community
- While we cannot give legal advice, we are willing to help Jewish content providers ensure bidirectional content compatibility with our project. We also maintain a list of projects with similar goals.
- Documentation
- We need end-user and mid-level documentation to ease people into contributing to the project and using the materials we already have.
- Project management
- We are looking to improve resource management in the project, including volunteer management to make sure anyone who wants to contribute can find a place in the project.