IRC Conference/logs/2009-08-16

Please see the summary. Please direct any questions about this log to the mailing list.

 [2009-08-16 10:54:26] <@realazthat> hi [2009-08-16 10:55:04] *** realazthat changed topic to The Jewishliturgy Project. -IRC Conference at 11:00 AM EST- agenda: http://wiki.jewishliturgy.org/IRC_Conference. [2009-08-16 10:55:13] *** chaimss (i=44c0d935@gateway/web/freenode/x-urfjkwqzapwffvpv) joined [2009-08-16 10:55:20] Hi to you, too. [2009-08-16 10:55:21] *** realazthat changed topic to The Jewishliturgy Project. -IRC Conference at 11:00 AM EST- agenda: http://wiki.jewishliturgy.org/IRC_Conference. Please feel free to ask any questions that you might have. [2009-08-16 10:55:28] *** EfraimDF (i=efeins@dhcp-0016640289-5b-9d.client.student.harvard.edu) joined [2009-08-16 10:56:05]  hi [2009-08-16 10:56:21] *** Azriel (i=182e4988@gateway/web/freenode/x-esxssvfpzqtsmhha) joined [2009-08-16 10:56:32] Good morning [2009-08-16 10:56:40]  good morning [2009-08-16 10:56:50]  hi [2009-08-16 10:57:11] *** ChanServ sets channel #jewishliturgy mode +o EfraimDF [2009-08-16 10:57:33] *** ilancohen (i=47ad4d27@gateway/web/freenode/x-qluihfharqhtgvzr) joined [2009-08-16 10:57:46] *** qwebirc9555 (i=183c14fc@gateway/web/freenode/x-molvbpbewuesafcu) joined [2009-08-16 10:57:50] <@EfraimDF> hello/welcome [2009-08-16 10:57:54] hello [2009-08-16 10:58:00] *** qwebirc9555 (i=183c14fc@gateway/web/freenode/x-molvbpbewuesafcu) left [2009-08-16 10:58:08] *** qwebirc87398 (i=183c14fc@gateway/web/freenode/x-xwwzyytrlgvuupki) joined [2009-08-16 10:58:19] <@EfraimDF> if you want a real nickname, use the /nick command [2009-08-16 10:58:46] <@EfraimDF> the web gateway assigns a random one by default [2009-08-16 10:59:30] *** qwebirc34334 (i=183c14fc@gateway/web/freenode/x-mcqzmdnovhnyykfk) joined [2009-08-16 10:59:49] Oh IRC, how I've missed you. 
[2009-08-16 11:00:00]  heh [2009-08-16 11:00:12] *** qwebirc87398 changed nick to SHF [2009-08-16 11:00:25] *** qwebirc34334 quit (Client Quit) [2009-08-16 11:00:31]  Hello [2009-08-16 11:00:36]  Hi all [2009-08-16 11:00:56] *** qwebirc30319 (i=62d86f4f@gateway/web/freenode/x-jsipdfacfdhnmdfa) joined [2009-08-16 11:01:08] <@EfraimDF> Use /nick to set your name [2009-08-16 11:01:13] *** chaimss changed nick to Chaim [2009-08-16 11:01:30] <@EfraimDF> It's 11:00 ET/6:00 IT now... I think we can go around a bit and if everyone here can introduce themselves with a name/short sentence about background/interests, we can get started. [2009-08-16 11:02:16] <@EfraimDF> eg... I'm Efraim Feinstein, project lead developer. I'm interested in all aspects of getting this project going. [2009-08-16 11:02:28]  I am Azriel, I joined Efraim a few months back and hope to help bring the project to fruition [2009-08-16 11:03:01] *** qwebirc30319 changed nick to evefeinstein [2009-08-16 11:03:14] <@EfraimDF> (btw this chat will be logged and the log will be posted online) [2009-08-16 11:03:42] <@EfraimDF> (no need to wait on protocol now, just pipe in...) [2009-08-16 11:03:42] i'm efraim's wife. here mostly out of curiosity :) [2009-08-16 11:03:44]  Hi, I'm Shira and I'm interested in helping out [2009-08-16 11:03:47] Ilan Cohen - I'm interested in many things, but will likely put effort primarily into development, specifically of web application frontends - the interface. I haven't done much work yet, but I hope to once I come back from vacation (in September) [2009-08-16 11:04:40] *** Aharon (n=chatzill@c-76-116-248-242.hsd1.pa.comcast.net) joined [2009-08-16 11:04:49] <@EfraimDF> hi Aharon, we're just doing introductions. 
[2009-08-16 11:05:05]  Good morning and shavuah tov everyone [2009-08-16 11:05:13]  I'm Chaim- I'm Azriel's friend, and while I'm just starting to learn how to program now and probably won't be much help for a few years, I'm definitely excited about the idea and would like to do what I can to help. [2009-08-16 11:05:19] Good morning, shavua tov. [2009-08-16 11:05:37] I'm Yonah Lavery. I think this is a wonderful project but can only help with either mindless grunt work or anything artsy/original. [2009-08-16 11:05:43]  Chaim: programming is a small part of the project [2009-08-16 11:06:22] <@EfraimDF> OK. We have something of an agenda. Probably the first thing to do is to go through what the aspects of development are and how far along each one is [2009-08-16 11:06:27] <@EfraimDF> And maybe a bit about what it will need [2009-08-16 11:07:04] <@EfraimDF> I think all of you are on the email list [2009-08-16 11:07:11]  Ok, great, so maybe I can do more now :) [2009-08-16 11:07:22] <@EfraimDF> So, you probably know that there are a number of different aspects of the project [2009-08-16 11:08:01] <@EfraimDF> Globally, the idea is to create an archive of siddur material that can be parsed for all applications by any number of interfaces [2009-08-16 11:08:30] <@EfraimDF> And to provide one web-based interface for working collaboratively on development [2009-08-16 11:08:59] <@EfraimDF> and (probably a different one) for end users to be able to make their own choices [2009-08-16 11:09:18] <@EfraimDF> using the archival material to develop a personal siddur [2009-08-16 11:09:50] <@EfraimDF> where "personal" can mean "for an individual", "for a community" and such [2009-08-16 11:10:21] <@EfraimDF> (by the way, if anyone has any comments in the middle of text streaming, just put up a ? to the screen) [2009-08-16 11:10:54] <@EfraimDF> In order to get there, we need to make a number of developments. 
[2009-08-16 11:11:14] <@EfraimDF> Before we can have the archive, for example, we need a format for storing the archive. [2009-08-16 11:11:41] <@EfraimDF> and a place to store the data, including standard protocols for how the data store works [2009-08-16 11:11:55] <@EfraimDF> None of this will work without having the data [2009-08-16 11:12:18] <@EfraimDF> And, we need an interface both for entering data and manipulating it in all the ways a user might want to [2009-08-16 11:12:25] <@EfraimDF> Any questions/comments so far? [2009-08-16 11:12:43] ? [2009-08-16 11:12:46]  ? [2009-08-16 11:12:49] <@EfraimDF> Ilan? [2009-08-16 11:13:08] Just wondering if the plan is to come out of this with a clear game plan, or just to get everyone on the same page. [2009-08-16 11:13:33] <@EfraimDF> first we need to get everyone on the same page. Some of the development has already begun [2009-08-16 11:13:51] <@EfraimDF> We don't yet have sets of milestones/development priorities [2009-08-16 11:13:59] <@EfraimDF> and that's something we need to address. [2009-08-16 11:14:00] ok [2009-08-16 11:14:03] <@EfraimDF> Chaim? [2009-08-16 11:15:06] *** qwebirc71045 (i=546eeee4@gateway/web/freenode/x-iwozrbncilkgmbld) joined [2009-08-16 11:15:24] <@EfraimDF> (hello. To choose a nickname, use the /nick command ) [2009-08-16 11:15:25]  When you say 'a place to store the data,' why wouldn't a normal server work? Is something special needed? [2009-08-16 11:16:07] <@EfraimDF> It is a server. It was mostly an issue of choosing which technology would best suit the purpose [2009-08-16 11:16:44] <@EfraimDF> Now on to where we are... [2009-08-16 11:16:52]  thanks [2009-08-16 11:17:47] <@EfraimDF> The definition of the data storage format has made a lot of progress. 
It is an extended version of the XML format developed by the Text Encoding Initiative (TEI), which we are calling JLPTEI (Jewish Liturgy Project TEI) [2009-08-16 11:18:04] *** qwebirc71045 quit (Client Quit) [2009-08-16 11:18:17] <@EfraimDF> The basics of how one would use the format are now on the wiki [2009-08-16 11:19:01] <@EfraimDF> And a set of very basic schemas written in the TEI's schema generating language (ODD) are in our subversion archive (which is hosted at Google Code) [2009-08-16 11:19:44] <@EfraimDF> As of now, the content models of the data are not completely worked out. Probably, most of what we need to do is make them tighter (more restrictive) than they are now [2009-08-16 11:20:13] <@EfraimDF> And that includes modifying the schemas themselves, and generating rule-based validators for use on-the-fly (eg, by the web application) [2009-08-16 11:20:54] <@EfraimDF> We also need better documentation, and to make sure that the schema guidelines make sense to someone other than me [2009-08-16 11:21:12] <@EfraimDF> The schema guidelines' intended audience is developers [2009-08-16 11:21:21] <@EfraimDF> The next part is the data store. [2009-08-16 11:21:47] <@EfraimDF> I've been playing with a native XML database called eXist. We have a server up and running, although not all the features that I want are working yet [2009-08-16 11:22:17] <@EfraimDF> It can (as of now), store data and data can be retrieved, and it can serve as a platform for the API [2009-08-16 11:22:31] <@EfraimDF> (the methods by which applications interface with the data) [2009-08-16 11:22:39] ? [2009-08-16 11:22:42] <@EfraimDF> Ilan? [2009-08-16 11:23:00] *** SarahA (i=546eeee4@gateway/web/freenode/x-ffxwjexsrtloqzta) joined [2009-08-16 11:23:03] Do we have demos up - proofs-of-concept? [2009-08-16 11:24:05] <@EfraimDF> Not quite yet. Most of the db stuff we have working so far is pretty low-level. 
If someone wanted to get involved in development, we could certainly make access available. The source code of all the APIs that are in use is up on subversion. [2009-08-16 11:24:19] <@EfraimDF> But, this is a good place to turn over to Azriel, who will talk about the front ends [2009-08-16 11:24:29] Ok. Maybe that'll be something I can work on. [2009-08-16 11:24:46] *** maslen (n=a@ool-18bed282.dyn.optonline.net) joined [2009-08-16 11:24:53] <Azriel> I was hoping to have a proof of concept to show today [2009-08-16 11:25:04] <Azriel> but unfortunately that wasn't possible [2009-08-16 11:25:21] When do you think it will be done ? [2009-08-16 11:25:44] Once it's available, many more people (I'm immediately thinking of myself) would be able to help [2009-08-16 11:25:48] <Aharon> just a protocol request.... [2009-08-16 11:26:04] <Azriel> a proof of concept can be completed very soon [2009-08-16 11:26:06] <Aharon> please introduce yourself if you haven't already before asking a question, thanks! [2009-08-16 11:26:33] <Azriel> the first web application that I am working on [2009-08-16 11:26:48] <Azriel> is the transcription framework [2009-08-16 11:27:03] <Azriel> so that we can begin to take in contributions to the project [2009-08-16 11:27:18] Azriel - I'd like to talk to you (later) about how I can help with this. [2009-08-16 11:27:24] i have a question about this [2009-08-16 11:27:29] <@EfraimDF> Eve? [2009-08-16 11:27:29] <Azriel> ilancohen: sure [2009-08-16 11:27:43] i'm not a computer person, so i have no idea how much work this would be to set up [2009-08-16 11:28:18] but i think some people would find it easier to transcribe things if they could do it without downloading the hebrew keyboard, say, by clicking on keys on the screen [2009-08-16 11:28:33] <@EfraimDF> I think Azriel can answer that question :-) [2009-08-16 11:28:38] <Azriel> indeed this is planned [2009-08-16 11:28:44] great! 
[2009-08-16 11:29:12] (that was totally not a planted question, btw :)) [2009-08-16 11:29:32] <Chaim> ? [2009-08-16 11:29:32] *** moshe (i=18bed282@gateway/web/freenode/x-gknsnrdprhwchmzt) joined [2009-08-16 11:29:37] <Azriel> Chaim? [2009-08-16 11:29:38] <@EfraimDF> Chaim? [2009-08-16 11:29:39] *** maslen quit (Client Quit) [2009-08-16 11:29:57] <SHF> And what about transcribing in other formats, not directly into a website? [2009-08-16 11:30:03] <SHF> Like with Mellel or something else [2009-08-16 11:30:24] Azriel: If I may ask, were you working on anything besides the online tool for transcribing? [2009-08-16 11:30:30] <Chaim> I personally would prefer using the keyboard, so dual-entry would be nice too ;) But not for me, but for others, it would probably have to be a low-tech appelet [2009-08-16 11:31:01] <@EfraimDF> SHF: Ultimately, we don't care what you type with. As long as the text appears to us as Unicode Hebrew.  I don't know Mellel well enough to tell you whether it is. [2009-08-16 11:31:17] <Azriel> moshe: yes, I am helping Efraim with the specs for storing the data [2009-08-16 11:31:22] <@EfraimDF> Chaim: Nothing prevents you from not using the keyboard applet. [2009-08-16 11:31:28] <Azriel> moshe: and currently administering the wiki [2009-08-16 11:31:55] <@EfraimDF> Chaim: It's just a crutch to help people who don't know how to use Hebrew keyboards (and hopefully help them learn the keyboard so they can become faster typists) [2009-08-16 11:32:06] <Azriel> SHF: I am not familiar with the program, but in the worst case, it can probably be copied and pasted [2009-08-16 11:32:12] <Azriel> SHF: I will look into it [2009-08-16 11:32:36] <@EfraimDF> Which brings up another aspect of what's needed to keep the project working -- system adminstration [2009-08-16 11:33:27] <Chaim> @EfraimDF: Ah, ok. Good. But I think SHF's on to something, could someone import, say, a txt file instead of using the site? 
For example, I can see myself doing the transcription on a bus, etc. when there's no connection. [2009-08-16 11:33:32] <@EfraimDF> Right now, Azriel and I are pretty much doing all the administration. The wiki is being hosted on a Dreamhost account, use of which was donated to us. The database is being hosted on a Virtual Private Server (VPS) [2009-08-16 11:34:12] <Azriel> Chaim: in addition to being able to copy/paste, I hope to have eventually have Google Gears support [2009-08-16 11:34:21] <Azriel> Chaim: that will allow the application to work offline [2009-08-16 11:34:31] <Azriel> Chaim: and sync up when you connect [2009-08-16 11:34:51] <Aharon> @chaim: Yes [2009-08-16 11:35:00] <Aharon> you can open up a simple text file [2009-08-16 11:35:14] <Aharon> and transcribe hebrew in unicode [2009-08-16 11:35:23] <SHF> @Chaim - tangent, but Bolt bus has internet, too! :) [2009-08-16 11:35:41] <Aharon> and then cut and paste that later on, into the textarea in the transcription interface and click submit [2009-08-16 11:35:47] <SHF> Also, question about proof of concept - what about the Haggadah? Didn't that go all the way from the database to a PDF? [2009-08-16 11:35:51] Azriel: If I may ask, the unicode hebrew that you're using, are the letters + the nikud a single character? or is it two characters one after the other? [2009-08-16 11:36:05] <@EfraimDF> SHF: The trick there was that there was no database! [2009-08-16 11:36:21] <Azriel> moshe: Efraim can best answer that [2009-08-16 11:36:37] <@EfraimDF> SHF: The differences between what we want and what happened in the POC were that: [2009-08-16 11:37:07] <@EfraimDF> SHF: (1) the POC used a very simplified version of the XML encoding, which turns out to not be scalable to everything we need to do and [2009-08-16 11:37:07] <SHF> Sorry if I'm not understanding. And I'm not clear on how to direct message here. Sorry. 
[2009-08-16 11:37:39] <@EfraimDF> SHF: (2) it used the filesystem + a set of command line based utilities to do all the transforms [2009-08-16 11:37:48] <Aharon> POC stands for (Proof-of-Concept) [2009-08-16 11:38:14] <@EfraimDF> SHF: It taught me (and hopefully us as a group) a lot of how to do things, but it's not really usable by the average person who doesn't have time to fiddle with it and make it work [2009-08-16 11:38:19] Oh good, I'm used to it being People of Colour and was really confused. [2009-08-16 11:38:43] <@EfraimDF> SHF: Part of what we need to get this project more mainstream is to make it easier to contribute/develop [2009-08-16 11:39:03] <@EfraimDF> Moshe: letters+ nikud are multiple Unicode codepoints (characters) [2009-08-16 11:39:45] <@EfraimDF> Moshe: they are composed later on by some combination of the fonts and font renderers [2009-08-16 11:39:46] EfraimDF: So then could we use OCR software just for scanning the letters, and the manually add int he nikud ? I think (hope) that would be a faster way of doing it [2009-08-16 11:40:09] <@EfraimDF> Moshe: I haven't found any OCR software that even got the consonants remotely right [2009-08-16 11:40:20] Scanned at 300 DPI grayscale ? [2009-08-16 11:40:22] :( [2009-08-16 11:40:41] <Aharon> We would certainly invite more people to start developing an open source Hebrew OCR based on Tesseract [2009-08-16 11:41:05] <Azriel> I have a friend who is currently attempting to train Tesseract [2009-08-16 11:41:08] <@EfraimDF> OCR (optical character recognition) tech is improving a lot. It's just not there yet [2009-08-16 11:41:15] let me see what I could do with one of the pictures from the site.... 
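A minimal sketch of the point above that letters and nikud are separate Unicode codepoints, and of the idea of capturing consonants first and adding nikud afterwards. The sample word and the helper name `split_letters_and_marks` are assumptions made for illustration and are not part of the project's code.

```python
import unicodedata

# Hypothetical sample: "shalom" with nikud, spelled out as explicit codepoints
# (shin, shin dot, qamats, lamed, vav, holam, final mem).
word = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"


def split_letters_and_marks(text):
    """Separate base letters from combining marks (nikud, shin/sin dots)."""
    # Combining marks have a non-zero canonical combining class.
    letters = "".join(ch for ch in text if unicodedata.combining(ch) == 0)
    marks = [ch for ch in text if unicodedata.combining(ch) != 0]
    return letters, marks


consonants, marks = split_letters_and_marks(word)

# Show that each point is stored as its own codepoint after the letter it modifies.
for ch in word:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Stripping the marks this way gives the consonant-only text that an OCR pass might be asked to produce, with the points left for a human to add.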
[2009-08-16 11:41:23] <Azriel> but I do not know how well he will fare [2009-08-16 11:41:27] * moshe boots up his server [2009-08-16 11:41:39] <@EfraimDF> Moshe: If you need it, we actually have higher quality pictures than the ones on the site [2009-08-16 11:41:43] <Azriel> http://jewishliturgy.org/base/sources/Baer-Seder_Avodat_Yisrael/ [2009-08-16 11:42:06] <Azriel> the originals are huge [2009-08-16 11:42:17] that's good. Huge is a + for OCR [2009-08-16 11:42:18] <@EfraimDF> Moshe: The original scans were taken at 300dpi, and each image is about 15MB. They were compressed into JPGs for the webserver to get them down to 300KB. [2009-08-16 11:42:26] <Aharon> I actually did see a Hebrew OCR that was rather perfect for text, and about 80% with nikkud, but it wasn't an open (or even commercial) product, while searching for this in Israel [2009-08-16 11:42:47] EfraimDF: : Where can I get some of the original pictures? [2009-08-16 11:42:59] <Chaim> I've seen it, but it was expensive (upward of $45 [2009-08-16 11:43:00] <Chaim> ) [2009-08-16 11:43:13] <Chaim> And I have no idea how well it works. [2009-08-16 11:43:17] <@EfraimDF> Moshe: From me or Azriel. Email me privately and I'll give you access. It's a 6GB download :-) [2009-08-16 11:43:30] I'd have to download them all ? [2009-08-16 11:43:37] <Azriel> no [2009-08-16 11:43:41] <@EfraimDF> No, you can select individual pics if you want [2009-08-16 11:43:53] Could I request an email/DCC of just one? [2009-08-16 11:44:09] <@EfraimDF> One is too big for email. 
I can get it to you by sftp [2009-08-16 11:44:31] <@EfraimDF> or yousendit.com or something like that [2009-08-16 11:44:47] gmail allows up to 25 MB now [2009-08-16 11:45:07] <@EfraimDF> In terms of open source, we want the entire *required* toolchain to be FOSS (free/open source software) [2009-08-16 11:45:36] that's a different issue :( [2009-08-16 11:45:36] <@EfraimDF> However, if someone finds a good closed source OCR product, they are free to use it on their own [2009-08-16 11:45:38] <@EfraimDF> Moshe: I was not aware of that [2009-08-16 11:45:53] <@EfraimDF> Moshe: email me efraim.feinstein@gmail.com and I'll send you a picture [2009-08-16 11:46:30] <@EfraimDF> Any other questions/comments related to where we are now? [2009-08-16 11:46:53] <Aharon> This was an important point regarding the values of the JLP/Open Siddur project and relates to what we mean by "Open" [2009-08-16 11:47:22] <@EfraimDF> Please continue, Aharon. :-) [2009-08-16 11:47:41] As long as any closed source program doesn't change the licensing, it's an option though? [2009-08-16 11:48:11] <@EfraimDF> Moshe: Yes. Fortunately, most software companies make no claim to the data produced by their software (Wolfram Research being a notable exception) [2009-08-16 11:48:42] EfraimDF: Azriel'll email it to me, thanks. [2009-08-16 11:48:49] Wolfram is Mathematica ? [2009-08-16 11:49:21] <@EfraimDF> Moshe: By the time it gets to us, it should be indistinguishable whether it came from OCR or was typed in. (Yes, the TOS of Wolfram Alpha makes an IP claim on search results) [2009-08-16 11:50:27] <Chaim> But wouldn't using OCR still require a human checking to make sure it got it right? [2009-08-16 11:50:28] <Aharon> Recently, a journo from Tablet asked us what we meant by "Open" in Open Siddur [2009-08-16 11:50:46] journo? [2009-08-16 11:50:51] Tablet? [2009-08-16 11:50:52] <Aharon> journalist [2009-08-16 11:50:57] <Aharon> Tablet Magazine, [2009-08-16 11:50:58] Ah. 
[2009-08-16 11:51:06] Chaim: Yes, but considering how slow most people type hebrew (due to lack of experience), editing could be significantly faster than retyping it from scratch [2009-08-16 11:51:08] <@EfraimDF> Chaim: yes, it would. Actually, the quality control procedures are something we need to develop. We have some ideas as to how it would be done. [2009-08-16 11:51:10] <Aharon> formerly known as Nextbook Magazine [2009-08-16 11:52:25] <Aharon> Open = Open source yes, but also, open access for sharing, contributing and adapting content of the siddur from Jews all over the world, in all the different languages they speak [2009-08-16 11:53:18] <@EfraimDF> The idea is to distinguish "open" as a buzzword from "open" as truly "free and open source" [2009-08-16 11:53:26] <@EfraimDF> we aim for the latter [2009-08-16 11:53:36] <Chaim> ?So there would have to be a way for someone to create a Siddur, and then in "one click" (or whatever the number is) translate that specific Siddur into another language without having to go throuth the whole process again. [2009-08-16 11:53:54] <Chaim> (oops, meant to cut everything after the '?') [2009-08-16 11:54:13] <@EfraimDF> Chaim: I doubt we'll ever have one-click translation. We could (in theory) have one click selection of a translation [2009-08-16 11:54:14] <Aharon> the content is limited by what is Free. [2009-08-16 11:54:21] <Azriel> Aharon came up with the word "Siddur recipe" [2009-08-16 11:54:51] <Azriel> that one would be able to choose all of the options from the nusach to the translations [2009-08-16 11:55:01] <Aharon> this is why the project needs to be Open [2009-08-16 11:55:22] <@EfraimDF> (that is, translation still requires humans to do it. 
If we find a way to make a good translation done by a computer, we can all retire to Tahiti or something like that ;-) ) [2009-08-16 11:55:49] <Azriel> we would provide some sensable "recipes" that are the most common [2009-08-16 11:55:54] <Azriel> heh [2009-08-16 11:55:55] <Chaim> @EfraimDF: With all the money you made from this 100% open project ;) [2009-08-16 11:55:56] <Aharon> it simply is not possible for a small team to even solicit all the content from the vast corpus of content out there needed to make siddurim suitable for every Jew in every community [2009-08-16 11:56:19] <Chaim> Aharon: Maybe not at first, but as things grow, and more people become interested... [2009-08-16 11:56:58] starting with sefard/ashkenaz + Eretz Yisrael/outside sounds like a decent start [2009-08-16 11:56:58] <Aharon> exactly, as a social network built around the sharing of texts and collaborative projects (translating, transcribing, etc) [2009-08-16 11:57:27] If only there was some magical way to make radio buttons + checkboxes interface with it easily [2009-08-16 11:57:30] <@EfraimDF> Chaim/Aharon: Yes, and one thing that's quite important is source gathering so we can mine everything that is relevant and available to us. [2009-08-16 11:57:42] <Azriel> moshe: thats what we plan to do [2009-08-16 11:57:58] <Azriel> moshe: A web application that will create recipes [2009-08-16 11:58:09] speaking of siddur recipes, one of the things I really love about this project is the potential for people to make mistakes. [2009-08-16 11:58:16] ? [2009-08-16 11:58:17] moshe/Azriel: If I may be so bold, that's where I come in. [2009-08-16 11:58:22] <@EfraimDF> moshe: the backend XML is intended to be processed into something that the web app can turn into radio boxes and check boxes [2009-08-16 11:58:33] <Aharon> "shgiyot mi yavin, ministarot nakenini" [2009-08-16 11:58:45] I made a similar web app for a caterer - making menus from lots of options. 
[2009-08-16 11:59:14] <@EfraimDF> yonah: we can't prevent people from making mistakes, since the computer won't know. One person's mistake is another one's preference. [2009-08-16 11:59:25] I can adapt what I learned there (and perhaps soem of the code) into a web app. [2009-08-16 11:59:33] I hope I didn't sound sarcastic, I really mean that. [2009-08-16 11:59:43] *** SHF quit (Ping timeout: 180 seconds) [2009-08-16 11:59:44] <Aharon> "yet who can perceive mistakes? from unknown faults preserve me" [2009-08-16 11:59:50] so it's boils down to having simple labeling for the XML, so the persons browser could process it [2009-08-16 12:00:20] <@EfraimDF> yonah: The best we can do is quality-control the text that goes in so we can claim to some high level of accuracy that it is what we say it is [2009-08-16 12:00:21] <Azriel> moshe: that would be a transform, and yes, but it gets more complex [2009-08-16 12:00:26] <Aharon> in terms of the envisioned web application's interface: [2009-08-16 12:00:32] Azriel: How? [2009-08-16 12:00:46] <Chaim> Aharon: :) [2009-08-16 12:00:52] <Aharon> variations in text between "recipes" (ie nuschaot) are going to be highlited [2009-08-16 12:01:12] I think many people are frustrated with being told for example take three steps back before tzur yisrael even if nobody in your community does that, they do that before the shemoneh esreh. I mean artscroll is an extreme example but still. [2009-08-16 12:01:55] <Aharon> i think yonah's point gets at one reason we're using XML [2009-08-16 12:01:56] EfraimDF: Maybe it would be worth requiring multiple passes? [2009-08-16 12:01:58] <Azriel> moshe: there are millions of permutations, in addition we want to support "injecting" other content to what we ourselves store, including instructions, comments, and changing part of the source text [2009-08-16 12:02:00] <Chaim> But would people care about that? 
I mean, it's their decision whether to or not [2009-08-16 12:02:01] <@EfraimDF> yonah: part of the beauty here is that you can have multiple sets of instructions and that the user could choose which minhag to follow. [2009-08-16 12:02:28] Azriel: But wouldn't it basicallty be "if this is checked, display these, else, Don't " [2009-08-16 12:02:29] <@EfraimDF> moshe: for which aspect? [2009-08-16 12:02:41] EfraimDF: Basically all of them [2009-08-16 12:02:54] <@EfraimDF> moshe: I meant is that a process question or a technical question? [2009-08-16 12:02:55] EfraimDF: Transcribing, translating, adding on details [2009-08-16 12:03:00] <Azriel> moshe: thats similar to saying that computers are simple, because its all in binary [2009-08-16 12:03:01] process [2009-08-16 12:03:18] <@EfraimDF> moshe: It has to be a multistep process. [2009-08-16 12:03:26] <Azriel> moshe: and you as a computer engineer major know its not like that [2009-08-16 12:03:27] <Aharon> text of the siddur that is designated in the XML as halachah/instructional text, can select to have a wide array of texts included, or none [2009-08-16 12:03:41] <Azriel> moshe: on the basic level it is [2009-08-16 12:03:43] <@EfraimDF> moshe: The first thing that happens is transcription, probably from one source [2009-08-16 12:03:49] like requiring transcribed text to be checked twice before it can move to the next step [2009-08-16 12:04:00] <Azriel> "double keying" [2009-08-16 12:04:04] <Azriel> thats the plan [2009-08-16 12:04:06] <@EfraimDF> moshe: there are 2 ways we can think of doing proofreading [2009-08-16 12:04:21] I have to go. But I'll be following up on this stuff. Azriel - talk to me about web interface stuff. [2009-08-16 12:04:29] <Azriel> I will [2009-08-16 12:04:39] <Azriel> ilancohen: we will post the logs [2009-08-16 12:04:45] <@EfraimDF> moshe: one is kind of like classical copyediting. A person checks the original source against the transcribed text and corrects errors. 
[2009-08-16 12:04:55] *** ilancohen quit ("Page closed") [2009-08-16 12:05:11] <@EfraimDF> moshe: the second is to have two people transcribe the same text and run a differencing algorithm against them ("double keying") [2009-08-16 12:05:11] <Chaim> But shouldn't transcribing be the least of the problems? [2009-08-16 12:05:17] <Aharon> @chaim: it's a good question: what "people" care about... if i might respond to that [2009-08-16 12:05:55] <@EfraimDF> Chaim: it's *one* problem that needs QC. Another would be checking references, for example. [2009-08-16 12:05:59] Chaim: Transcribing is the place where we could have the most errors, that wouldn't be caught for a looooong time [2009-08-16 12:06:12] As it requires the most effort to check [2009-08-16 12:06:19] <Chaim> But if something's in beta, isn't that to be expected? [2009-08-16 12:06:24] <@EfraimDF> Chaim: Another would be making sure that instructions are unambiguous. [2009-08-16 12:06:47] <@EfraimDF> Chaim: Good question, which gets a bit into the development process of texts! [2009-08-16 12:06:47] Maybe the "davening directions" could be in multiple levels ? [2009-08-16 12:06:57] <Chaim> You mean like "some congregations" [2009-08-16 12:07:36] <@EfraimDF> The proposal for development of texts so far is that all texts start out in a "contributed" section. [2009-08-16 12:07:46] <@EfraimDF> And are assumed to be not quality controlled. [2009-08-16 12:07:53] <@EfraimDF> They are sent through the QC process. [2009-08-16 12:08:05] like under the "Take three steps back". You could disable that from appearing. Or you could make it say that. Or you could open up a full description of how you're supposed to take three steps back " One foot completely behind the other" etc [2009-08-16 12:08:11] <Chaim> The problem is, most people don't know [2009-08-16 12:08:16] <@EfraimDF> When they pass all the technical checks, they may enter the "core" distribution. 
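A minimal sketch of the "double keying" check described above: two people transcribe the same text and a differencing step reports where they disagree. It uses Python's standard `difflib` and compares word by word. The function name `disagreements` and the sample strings are assumptions for illustration, not the project's actual QC tooling.

```python
import difflib


def disagreements(first, second):
    """List the spans where two transcriptions of the same text differ.

    Each result is (tag, words_from_first, words_from_second), where tag is
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    a, b = first.split(), second.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical transcriptions from two typists; the second has one wrong letter.
typist_one = "ברוך אתה"
typist_two = "ברוך אחה"

for tag, left, right in disagreements(typist_one, typist_two):
    print(tag, left, right)
```

Only the flagged spans need a human decision, so a reviewer would look at the source image for those words rather than rereading the whole passage.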
[2009-08-16 12:08:51] <@EfraimDF> Any breaking of technical policy in the core distribution is a bug. [2009-08-16 12:09:07] <Aharon> Personally, I would prefer that "instructional text" was also sourced from manuscript, as our other texts are [2009-08-16 12:09:25] <Aharon> and thus could also be accessed from our core archive of texts [2009-08-16 12:09:45] <@EfraimDF> "Core" does not have to be sourced from a manuscript [2009-08-16 12:09:52] <@EfraimDF> Original materials can be in core too. [2009-08-16 12:10:10] <Chaim> Aharon: But what about those that don't correlate, i.e. GR"A is Ashkenaz, but doesn't carry all Ashkenaz instructions. [2009-08-16 12:10:11] <Aharon> thanks for the clarification, yes, this is what i meant [2009-08-16 12:10:44] <@EfraimDF> Chaim: That's where providing sensible default values comes in [2009-08-16 12:11:21] <@EfraimDF> Chaim: I could see labeling instructions as "according to the minhag of..." [2009-08-16 12:11:31] <Aharon> for *all* instructions, we will need more contributors to transcribe them [2009-08-16 12:11:35] <@EfraimDF> Chaim: (the XML selection system actually has a way to handle them) [2009-08-16 12:11:49] <@EfraimDF> Chaim: (them->this) [2009-08-16 12:12:19] <@EfraimDF> There's also the language issue. Most of our audience (as of now) is first-language English [2009-08-16 12:12:38] <Azriel> moshe: the choice system is as powerful, and even more so than what you described [2009-08-16 12:12:39] <@EfraimDF> But the texts we have are first language Hebrew, Yiddish, or German [2009-08-16 12:13:13] <Aharon> a contemporary text that synthesizes and summarises the GR'A in addition to others would be welcome for sharing within the user contributed array of content shared through the Open Siddur application [2009-08-16 12:13:16] <@EfraimDF> The choice system is deceptively simple-looking. 
[2009-08-16 12:13:51] <Aharon> we are also anticipating texts in Amharic, Arabic, Norwegian [2009-08-16 12:13:52] <@EfraimDF> Or, even a transcription of the Vilna Gaon's works linked to the siddur texts. [2009-08-16 12:14:18] <Azriel> a similar web application to the transcription framework can be made for translations [2009-08-16 12:14:23] Wouldn't it be something along the lines of <minhag = "GRA"> etc, and only displaying those tags? (Is that even possible) ? [2009-08-16 12:14:30] *** SarahA quit ("Page closed") [2009-08-16 12:14:38] <@EfraimDF> Moshe: it's a bit more complex than that [2009-08-16 12:14:51] <@EfraimDF> Moshe: but effectively, that's what it does [2009-08-16 12:15:05] <Chaim> Are you expecting most users to be knowledgeable in Tefilla, or just beginning? Because if they're knowledgeable, they may not need so many instructions, they'll figure it out on their own. [2009-08-16 12:15:23] <Azriel> Chaim: we expect to be able to choose either one [2009-08-16 12:15:43] <@EfraimDF> Moshe: The reason for the complexity is that the same system has to handle all types of different switches [2009-08-16 12:16:11] <Aharon> a tool as powerful as the open siddur is an obvious candidate for teaching jewish liturgy to new students of the siddur [2009-08-16 12:16:37] <@EfraimDF> Moshe: Basically, a document defines the existence of a switch and documents what it does. That can be parsed by an application into radio buttons or selection boxes or other UI elements [2009-08-16 12:17:02] <@EfraimDF> Moshe: And the application generates XML that uses the appropriate selections. This file is a siddur recipe. [2009-08-16 12:17:08] EfraimDF: Understood [2009-08-16 12:17:19] <@EfraimDF> Moshe: A transform then converts it to a printing or display format. 
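A minimal sketch of the switch-and-recipe flow described above: a document declares a switch, an application reads it to build UI choices, the user's choices form a recipe, and a transform keeps only the matching segments. The element and attribute names (`switch`, `option`, `seg`, `value`) are hypothetical and are not the real JLPTEI schema, and the recipe is modelled as a plain dictionary rather than XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only; NOT the JLPTEI encoding.
SOURCE = """
<text>
  <switch name="instructions" default="brief">
    <option value="none"/>
    <option value="brief"/>
    <option value="full"/>
  </switch>
  <seg>Core text shared by every recipe.</seg>
  <seg switch="instructions" value="brief">Take three steps back.</seg>
  <seg switch="instructions" value="full">Take three steps back, one foot completely behind the other.</seg>
</text>
"""


def available_switches(root):
    """Map each switch name to (default, allowed values) so a UI could render radio buttons."""
    return {
        sw.get("name"): (sw.get("default"), [o.get("value") for o in sw.findall("option")])
        for sw in root.iter("switch")
    }


def apply_recipe(root, recipe):
    """Return the text of segments that are unconditional or match the recipe's choices."""
    switches = available_switches(root)
    selected = []
    for seg in root.iter("seg"):
        name = seg.get("switch")
        if name is None:
            selected.append(seg.text)  # unconditional text is always kept
            continue
        choice = recipe.get(name, switches[name][0])  # fall back to the declared default
        if seg.get("value") == choice:
            selected.append(seg.text)
    return selected


root = ET.fromstring(SOURCE)
print(available_switches(root))
print(apply_recipe(root, {"instructions": "full"}))
print(apply_recipe(root, {}))  # an empty recipe uses the defaults
```

An empty recipe corresponds to the "use default values" path, while a saved dictionary of choices plays the role of a shareable siddur recipe.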
[2009-08-16 12:17:23] <Aharon> the extent to which the application serves as its own "teacher" of jewish liturgy by presenting the different parts of the siddur with access to their variants is part of our challenge [2009-08-16 12:17:37] <Chaim> Ok [2009-08-16 12:18:31] <Aharon> it is a design challenge for certain [2009-08-16 12:18:48] <@EfraimDF> I'd expect that most people's first encounter with the application will be "select a nusach", "use default values", "print" [2009-08-16 12:19:32] @EfraimDF - for the whole siddur? [2009-08-16 12:20:00] *** yitz_ (n=yitz@unaffiliated/yitz-/x-6738568) joined [2009-08-16 12:20:08] <Chaim> Yonah: Why not? It would make sense [2009-08-16 12:20:14] <Azriel> an intersting web ui development, is that fonts can now be embedded in most browsers ( eg.FF, IE, chrome) [2009-08-16 12:20:26] <Chaim> Yonah: If you don't want to take the time to personalize everything, you shouldn't have to. [2009-08-16 12:20:28] <Azriel> this shortens the toolchain [2009-08-16 12:20:30] <@EfraimDF> yonah: probably, yes. Someone who wants more detail could choose down to the level of whether Mordechai is spelled with a hataf qamatz or a sheva under the dalet [2009-08-16 12:20:51] Oh, okay, so we're talking about defaults, not possibiliies. [2009-08-16 12:20:55] *possibilities [2009-08-16 12:21:10] <Aharon> personally, i'm more interested in possibilities :) [2009-08-16 12:21:20] <Chaim> @EfraimDF: I hope someone would be able to create an account and save their preferences, so they wouldn't have to go through it every time? [2009-08-16 12:21:23] <@EfraimDF> It's something of a difference between the "first encounter with the application" and "using the application to its fullest potential" [2009-08-16 12:21:57] <Azriel> Chaim: not only that, but also to be able to share the recipes [2009-08-16 12:22:03] <@EfraimDF> Chaim: Yes, we are working on an account mechanism. 
Recipes can be saved and released to the public into the commons or kept local to the account. [2009-08-16 12:22:14] <Azriel> Chaim: in addition to including/sharing contributions of commentary etc. [2009-08-16 12:22:16] <Aharon> and i'm not so interested in speculating how "most" people will or won't use it. i want the Open Siddur to be a useful tool for folks who would otherwise need to use rubber glue and a binder [2009-08-16 12:22:17] I just ran through the pictures through the OCR.... more than 50% errors :( [2009-08-16 12:22:38] <@EfraimDF> Heh... I got about 90% error, so you're doing better than I did [2009-08-16 12:22:44] <Aharon> and won't likely have access to 99% of the content we'll be making available [2009-08-16 12:23:01] The trick would be to get it to be learning the font, then we'd have a lot less [2009-08-16 12:23:13] <@EfraimDF> the system is being *designed* (we hope) for maximum flexibility. [2009-08-16 12:23:20] <Aharon> Moshe: that's how Tesseract works [2009-08-16 12:23:31] Aharon: It's how most OCR's work [2009-08-16 12:23:40] you can set it to "learning mode" or the like [2009-08-16 12:24:13] <Aharon> true [2009-08-16 12:24:36] <Chaim> On a side note? [2009-08-16 12:24:39] <@EfraimDF> yes? [2009-08-16 12:24:59] <Chaim> Do you have a list of what has to be done first, second, etc. instead of having everyone doing everything at the same time? 
[2009-08-16 12:25:14] <@EfraimDF> There's the development status page on the wiki [2009-08-16 12:25:25] <@EfraimDF> Which, I admit, is a bit behind [2009-08-16 12:25:43] <@EfraimDF> It's probably better to block as little as possible on prerequisites [2009-08-16 12:25:52] <@EfraimDF> That way, we can each do parallel development [2009-08-16 12:25:59] <Aharon> I think a related question is, for those inclined to, what could be done now, that doesn't have a leader [2009-08-16 12:26:35] <@EfraimDF> system administration activity can certainly be shared [2009-08-16 12:26:40] <@EfraimDF> documentation improvement [2009-08-16 12:27:02] <@EfraimDF> once we have the transcription app up, transcription [2009-08-16 12:27:03] <Azriel> web app development can be shared [2009-08-16 12:27:07] <@EfraimDF> web app development [2009-08-16 12:27:24] <@EfraimDF> XML schema/validator development [2009-08-16 12:27:32] <@EfraimDF> API development [2009-08-16 12:28:13] <@EfraimDF> eventually, we'll need someone as the point-person on proofreading [2009-08-16 12:28:34] <@EfraimDF> but all the document management tasks are not really ready yet [2009-08-16 12:29:29] So for people who are illiterate in everything except really basic html (raises hand) there's no much which we can do to help at the moment - yes? 
[2009-08-16 12:29:36] <Aharon> yes there is [2009-08-16 12:29:48] <Aharon> i'll explain [2009-08-16 12:31:00] <Aharon> for example: i'm not coding right now, but i'm helping this project forward in a few different ways: [2009-08-16 12:31:33] <Aharon> 1) asking folks straight out for their work on the siddur [2009-08-16 12:31:52] <Aharon> 2) blogging and communicating the project [2009-08-16 12:32:16] <Aharon> 3) working on the design of the interface [2009-08-16 12:32:28] <Aharon> re: the latter [2009-08-16 12:32:42] <Aharon> we're in the design/planning phase of the Open Siddur web interface [2009-08-16 12:33:18] <Aharon> this is an essential area where non-tech folks need to contribute [2009-08-16 12:33:20] <@EfraimDF> Also, let's not forget research! Right now, we have one scanned public domain primary source which represents German Ashkenazic minhag from about the late 1860s [2009-08-16 12:33:32] <Aharon> absolutely [2009-08-16 12:33:54] <@EfraimDF> Researching (1) which siddurim are respected as emblematic of each nusach and [2009-08-16 12:34:02] <Aharon> i'm on the lookout for siddurim on our wishlist (wiki url?) [2009-08-16 12:34:02] <@EfraimDF> (2) finding copies of them and scanning them [2009-08-16 12:34:16] <Aharon> this is something that anyone with a scanner can do [2009-08-16 12:34:33] <@EfraimDF> http://wiki.jewishliturgy.org/Brainstorm_session [2009-08-16 12:34:36] <Aharon> following the scanning guidelines [2009-08-16 12:34:39] <Aharon> thanks Efraim [2009-08-16 12:34:53] <@EfraimDF> it's better if you have access to a library with a high speed scanner (it took me 4 hrs to do Avodat Yisrael) [2009-08-16 12:35:02] Thanks, this is really helpful. 
[2009-08-16 12:35:04] <Aharon> any text of halachah relating to the siddur we want [2009-08-16 12:35:53] <Azriel> btw, feel free to edit the wiki [2009-08-16 12:35:53] <Aharon> not to mention, translations, commentary, font-faces, and art [2009-08-16 12:36:17] <@EfraimDF> For the most part, I just don't know what are considered the respected texts in each nusach. Finding out and obtaining copies is hugely important. [2009-08-16 12:36:38] <Aharon> we simply don't have the Open Siddur social network app up yet where folks can upload, share, and craft their siddurim online [2009-08-16 12:36:50] <Aharon> but they can begin working on them now, in anticipation [2009-08-16 12:37:08] <@EfraimDF> within a few weeks, transcription should be open, though. And that will certainly open up a lot of nontechnical work [2009-08-16 12:37:17] <Aharon> there are public domain translations of the siddur in spanish, for example [2009-08-16 12:37:29] <Aharon> but we don't have anything like that scanned yet [2009-08-16 12:37:43] <@EfraimDF> There are PD english versions too. (eg, the original Singer or - I think Stern) [2009-08-16 12:37:47] <Aharon> we need to find those old siddurim with translations [2009-08-16 12:38:17] <@EfraimDF> Most of the English ones are very non-contemporary English, but they can serve as a source for updates [2009-08-16 12:38:33] <Aharon> personally, i would really love it if an artist or artist begins creating more open font's in hebrew [2009-08-16 12:38:42] Do you have anywhere a list of siddurim which should be in the public domain by now? [2009-08-16 12:38:58] <Aharon> this is a design project where an artist can also develop some technical facility of use to them in their career [2009-08-16 12:38:59] <@EfraimDF> Anything published before 1923 is in the public domain in the US. 
[2009-08-16 12:39:09] <Azriel> Aharon: the best place to begin would be the Ezra font we currently use [2009-08-16 12:39:30] <Azriel> Aharon: together with something like FontForge etc. [2009-08-16 12:39:34] <@EfraimDF> If it was published after 1923, then you need to do a lot of research to find out if it's PD because the laws are confusing [2009-08-16 12:39:51] <Azriel> Aharon: anyone with talent can make new open source fonts [2009-08-16 12:39:55] <@EfraimDF> Project Gutenberg and Stanford have a lot of information on that [2009-08-16 12:40:02] <Azriel> Aharon: http://en.wikipedia.org/wiki/Fontforge [2009-08-16 12:40:49] <@EfraimDF> Azriel: The Ezra SIL font has all the logic; I don't know how much would have to be changed if the glyphs change size, eg. I just don't know enough about font technical issues. [2009-08-16 12:41:05] <Aharon> that's right Azriel. The Ezra SIL font is a good place to start, and then an artist/designer could get busy making an open font which, for instance, shows the hebrew letters with their crowns [2009-08-16 12:41:36] <@EfraimDF> Actually, if someone can line-trace a (pasul?) sefer torah, you could probably make a good font :-) [2009-08-16 12:42:09] <Azriel> there is also a minor annoying issue with all of the hebrew fonts [2009-08-16 12:42:20] <Azriel> which is that the unicode must be ordered in a specific way [2009-08-16 12:42:30] <Azriel> (which isn't normally the case) [2009-08-16 12:42:54] <Azriel> if someone could fix that [2009-08-16 12:42:54] <@EfraimDF> (I think we can get into that if anyone's *really* interested in font development :-) ) [2009-08-16 12:43:03] <Aharon> how many characters would an artist/designer need to create in order to have a complete Hebrew Unicode set? [2009-08-16 12:43:34] <Azriel> I would guess simply replacing the characters that are already there [2009-08-16 12:43:48] <Azriel> though there might be sizing issues as efraim said [2009-08-16 12:44:08] <Aharon> yes, but how many is that? 
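Editorial sketch related to the ordering issue Azriel raises above. Unicode defines a canonical order for combining marks such as Hebrew vowel points, and normalization rearranges them into it. Whether this is the specific order the fonts discussed here expect is an assumption; the chat does not say. The example only shows that two typing orders of the same marks can be made identical with Python's standard `unicodedata` module.

```python
import unicodedata

BET = "\u05D1"     # HEBREW LETTER BET
QAMATS = "\u05B8"  # HEBREW POINT QAMATS (combining class 18)
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ (combining class 21)

# The same pointed letter, entered with its marks in two different orders.
dagesh_first = BET + DAGESH + QAMATS
vowel_first = BET + QAMATS + DAGESH

# NFD sorts adjacent combining marks by canonical combining class,
# so both inputs end up as the same code point sequence.
normalized_a = unicodedata.normalize("NFD", dagesh_first)
normalized_b = unicodedata.normalize("NFD", vowel_first)


def describe(text):
    """List the Unicode character names in `text`, in stored order."""
    return [unicodedata.name(ch) for ch in text]
```

Comparing `describe(dagesh_first)` with `describe(normalized_a)` shows the reordering. A font whose positioning rules expect some other order would need the text rearranged to that order instead.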
[2009-08-16 12:44:08] <Azriel> where the nequdos will look out of place if the character is very different from the current character [2009-08-16 12:44:09] <@EfraimDF> All the characters in http://unicode.org/charts/PDF/U0590.pdf + some additional punctuation characters [2009-08-16 12:45:27] <Azriel> not all of them have to be completely replaced [2009-08-16 12:45:35] <Azriel> they can be [2009-08-16 12:45:55] <Azriel> but a radically different font can be made by replacing the consonant [2009-08-16 12:47:20] <Aharon> i think the place to start would be one of the source masoretic texts [2009-08-16 12:47:29] <Aharon> one of manuscripts [2009-08-16 12:47:52] <@EfraimDF> Eve: if you're still there -- what type of alphabet would they be written in? [2009-08-16 12:47:53] <Aharon> and to line-trace the different letters and diacritical marks indicated there [2009-08-16 12:48:15] <@EfraimDF> Actually, we can figure it out -- the Aleppo Codex is online :-) [2009-08-16 12:48:22] <Aharon> excellent [2009-08-16 12:49:19] <@EfraimDF> www.aleppocodex.org [2009-08-16 12:50:39] <Azriel> is that all? [2009-08-16 12:50:46] <Aharon> As more volunteer contributors join the JLP/Open Siddur project, we will also need a volunteer manager. Who knows whether one of us will grow into that position, or whether someone already feels that's something they're perfectly capable of doing [2009-08-16 12:51:28] <Aharon> I will remain on the chat for anyone who has additional questions [2009-08-16 12:51:33] <@EfraimDF> what would that person's job be? 
[2009-08-16 12:51:41] <Aharon> I can also be contacted directly at aharon.varady@gmail.com [2009-08-16 12:51:52] <@EfraimDF> (I'm sticking around in case anyone wants to chat more) [2009-08-16 12:52:06] <Azriel> if anyone is curious about how the rendering would be done [2009-08-16 12:52:11] <Azriel> please ask [2009-08-16 12:52:11] <@EfraimDF> I think the "official" part of this chat is over -- unless there's something we haven't addressed [2009-08-16 12:52:31] <Aharon> @efraimdf: this was something recommended by Elisheva at PerlMonks for all open source projects whose capital is in volunteer in kind labor contributions [2009-08-16 12:53:17] <Aharon> a person who understands the different projects happening on the open siddur and can help orient and integrate new folks into areas they can immediately begin contributing to [2009-08-16 12:54:56] <@EfraimDF> sounds like a good idea. [2009-08-16 12:56:22] <Azriel> EfraimDF: where should I post the logs? [2009-08-16 12:56:24] <@EfraimDF> Could also be a primarily nontechnical position [2009-08-16 12:56:29] <@EfraimDF> jewishliturgy-discuss [2009-08-16 12:56:36] <Azriel> and the wiki? [2009-08-16 12:56:37] <@EfraimDF> or on the wiki [2009-08-16 12:56:47] <@EfraimDF> and send email to jl-d [2009-08-16 12:57:38] <@EfraimDF> before posting, it would be good to just write a few sentence summary of what issues we talked about [2009-08-16 12:58:23] <@EfraimDF> it shouldn't be long [2009-08-16 12:59:34] <@EfraimDF> btw Raphael did bring up a good question re:grammar encoding on the list [2009-08-16 13:00:25] <Azriel> what was that [2009-08-16 13:00:37] I'm off - thanks a lot, everyone, kol tuv. [2009-08-16 13:00:45] <@EfraimDF> at some point, when we have some useful text up, we'll probably want to start up Subproject Grammar to try to mark up... [2009-08-16 13:00:48] <@EfraimDF> bye [2009-08-16 13:01:12] <@EfraimDF> ... 
things like root words, accent [2009-08-16 13:01:16] <@EfraimDF> part of speech [2009-08-16 13:01:21] *** yonah quit ("Page closed") [2009-08-16 13:02:02] <Aharon> to note: I sent a request to our scholar advisors for their recommendations of seminal works we should add to our list growing at http://wiki.jewishliturgy.org/Brainstorm_session [2009-08-16 13:02:06] <Azriel> are there any technical problems with this? [2009-08-16 13:02:08] <@EfraimDF> these issues come out in text encoding, for example, in the sheva na/nach and qamats gadol/qatan decision [2009-08-16 13:02:36] <@EfraimDF> the latter is the more immediate practical/technical issue [2009-08-16 13:04:01] <@EfraimDF> eg, I know one system. There's at least one other one (probably more) in common use. They won't always agree [2009-08-16 13:05:34] <@EfraimDF> For example, הַמְבֹרָךְ -- how is the sheva under the mem pronounced? [2009-08-16 13:08:20] <@EfraimDF> (anyone there?) [2009-08-16 13:08:25] <Azriel> yes [2009-08-16 13:08:26] <Azriel> heh [2009-08-16 13:08:53] <Aharon> Folks: I'd be more than happy to start up summarising development status. I'm interested in different ways of representing this sort of information. I like the structure that mozilla uses: http://blog.mozilla.com/meeting-notes/archives/213 [2009-08-16 13:09:31] <@EfraimDF> do you have other examples? [2009-08-16 13:09:38] <@EfraimDF> (That looks fine to me) [2009-08-16 13:09:52] <@EfraimDF> The basic divisions are the same as on the dev status page on the wiki [2009-08-16 13:09:53] <Aharon> i don't :( [2009-08-16 13:10:03] <Aharon> yes, interestingly! 
[2009-08-16 13:10:19] <@EfraimDF> (which I can remove/set to point to your updates/have you take over with updates) [2009-08-16 13:10:30] <Aharon> i like that sections still exist even if points weren't touched on [2009-08-16 13:11:13] <Azriel> this is the page in the wiki that will contain the logs: [2009-08-16 13:11:13] <Azriel> http://wiki.jewishliturgy.org/IRC_Conference/logs/2009-08-16 [2009-08-16 13:11:18] <@EfraimDF> one way to do it might be updating the dev status page on the wiki [2009-08-16 13:11:26] <@EfraimDF> Thanks, Azriel! [2009-08-16 13:11:36] <@EfraimDF> and posting the result of the update to the blog [2009-08-16 13:11:56] <Azriel> http://wiki.jewishliturgy.org/IRC_Conference [2009-08-16 13:12:02] <Azriel> is what we were supposed to discuss [2009-08-16 13:12:44] <Aharon> the history of development is obscured by editing a single page over and again. i see the importance of both - dev status updates as well as an overview of where we are now [2009-08-16 13:13:24] <@EfraimDF> ok, are you intending to manage both? [2009-08-16 13:14:07] <Aharon> yes [2009-08-16 13:14:18] <Aharon> ok? [2009-08-16 13:14:18] <Chaim> I have to go, but Yosher Koach everyone! [2009-08-16 13:14:30] <@EfraimDF> thanks for participating, Chaim. [2009-08-16 13:14:37] <Aharon> thanks Chaim. i'm sorry i missed out on your introduction [2009-08-16 13:14:44] <Chaim> Thank _you_ [2009-08-16 13:14:57] <@EfraimDF> Aharon: ok with me [2009-08-16 13:15:05] <@EfraimDF> Aharon: I haven't been very good at it :-) [2009-08-16 13:15:21] *** Chaim (i=44c0d935@gateway/web/freenode/x-urfjkwqzapwffvpv) left [2009-08-16 13:16:56] <Aharon> I'm very curious about the folk who conferenced with us today. Did anyone not introduce themselves yet? 
[2009-08-16 13:17:25] <@EfraimDF> I think moshe missed the intros [2009-08-16 13:17:31] <Azriel> moshe is a friend of mine [2009-08-16 13:17:42] <Azriel> and I suspect yitz_ is as well [2009-08-16 13:18:10] <Azriel> if he is, he is the one who has volunteered to try using tesseract [2009-08-16 13:18:15] <yitz_> Nope [2009-08-16 13:18:28] <yitz_> I was pointed here by someone on irchighway [2009-08-16 13:18:32] <yitz_> Did I miss intros? [2009-08-16 13:18:36] <@EfraimDF> yes [2009-08-16 13:18:40] * yitz_ is a professional lurker [2009-08-16 13:18:51] <Aharon> welcome [2009-08-16 13:18:54] <yitz_> Thanks [2009-08-16 13:18:57] <@EfraimDF> They were at the beginning of the conference. welcome. [2009-08-16 13:19:14] <yitz_> I joined this room ~60 minutes ago [2009-08-16 13:19:26] <yitz_> 59 [2009-08-16 13:19:33] <yitz_> I think I missed the beginning [2009-08-16 13:19:41] <@EfraimDF> yes. [2009-08-16 13:19:53] <Aharon> Yitz, if this interests you, please visit http://opensiddur.net/join-us/ [2009-08-16 13:20:06] * yitz_ is a Canadian Computer Engineer who has the odd habit of programming as a hobby but without much spare time on his hands [2009-08-16 13:20:39] <Azriel> heh [2009-08-16 13:20:41] <Azriel> who does [2009-08-16 13:20:43] <Azriel> ;) [2009-08-16 13:20:47] <Aharon> i'd love it if while we're coding, we could imagine some mini-projects that folk like Yitz might want to adopt [2009-08-16 13:21:15] <@EfraimDF> I occasionally post those to the ML. [2009-08-16 13:21:21] <Azriel> more importantly, design advice can help [2009-08-16 13:21:37] <Aharon> efraim and azriel, what are your thoughts on ways to engage coders like yitz. 
[2009-08-16 13:21:50] <@EfraimDF> None of them are particularly exciting stuff (yet), but things like working on the db-independent build system [2009-08-16 13:22:24] <@EfraimDF> or writing test suites [2009-08-16 13:22:32] <@EfraimDF> for just about any aspect of what we do [2009-08-16 13:23:01] <Aharon> what point do we need t oget to before we can take a document like the one Reb Zalman contributed, and begin to encode it? [2009-08-16 13:23:26] <Aharon> or for that matter, one of the siddurim from wikisource [2009-08-16 13:23:30] <@EfraimDF> the encoding interface. [2009-08-16 13:23:50] <@EfraimDF> as for the siddurim from wikisource, I would really like to handle that slightly differently from just a data dump. [2009-08-16 13:24:07] <@EfraimDF> They tell you what their sources are. [2009-08-16 13:24:29] <@EfraimDF> I'd like to get copies of the sources and handle them as regular transcribed text *for the ones we can* [2009-08-16 13:24:59] <@EfraimDF> eg, one of them says it's copied mostly from Rinat Yisrael, and makes the questionable claim that all siddur text is PD. [2009-08-16 13:25:00] <Aharon> so our biggest hurdle is finding the text sources from which they were derived... and scan them [2009-08-16 13:25:06] <@EfraimDF> yes. [2009-08-16 13:25:24] <@EfraimDF> As you saw from Raphael, that claim will be challenged. [2009-08-16 13:25:31] <Aharon> Yonah has graciously offered to do some research for us and asked which text she should find [2009-08-16 13:25:56] <@EfraimDF> The list on the wiki is a good place to start [2009-08-16 13:26:05] <@EfraimDF> I really want to get my hands on all of those texts [2009-08-16 13:26:08] <Aharon> yes that was my thought as well [2009-08-16 13:26:10] <@EfraimDF> (or, my digital hands) [2009-08-16 13:26:17] <Aharon> i don;t think Rinat Yisrael is on that list yet... 
[2009-08-16 13:26:36] <@EfraimDF> as for nusach edot hamizrach, we need actual research to figure out which texts they respect [2009-08-16 13:26:40] <Aharon> her resource is http://discover.library.utoronto.ca/resources-research/other-library-catalogues [2009-08-16 13:26:42] <Aharon> oops [2009-08-16 13:26:43] <@EfraimDF> RY can't be. It's copyrighted [2009-08-16 13:26:58] <Aharon> rather: http://www.library.utoronto.ca/home/ [2009-08-16 13:27:10] <@EfraimDF> U Toronto has a good library [2009-08-16 13:27:34] <Aharon> so i think if we point her to a call number, she can get it [2009-08-16 13:27:57] <Aharon> same goes for me too, at http://www.franklin.library.upenn.edu/ [2009-08-16 13:27:57] <@EfraimDF> They don't use LC call numbers over there, do they? [2009-08-16 13:29:48] <@EfraimDF> here's one: [2009-08-16 13:29:50] <@EfraimDF> http://discovery.lib.harvard.edu/?hreciid=%7clibrary%2fm%2faleph%7c002267021 [2009-08-16 13:30:37] <Aharon> i'd like to for example: http://search1.library.utoronto.ca/UTL/index?N=0&Nr=p_catalog_code:948782&showDetail=first [2009-08-16 13:31:04] <Aharon> looks interesting for our studies of the Rav Saadya Gaon siddur [2009-08-16 13:31:10] <Aharon> dates to 1903 [2009-08-16 13:31:13] <@EfraimDF> Is it Hebrew or Arabic? [2009-08-16 13:31:27] <@EfraimDF> nm... written there [2009-08-16 13:31:54] <Aharon> hebrew... might not have anything to do with the siddur [2009-08-16 13:32:10] *** moshe quit ("Page closed") [2009-08-16 13:32:24] * yitz_ might have U Toronto lib access [2009-08-16 13:33:27] <@EfraimDF> I have Harvard Library access, but it takes a lot of time to scan stuff [2009-08-16 13:33:39] <@EfraimDF> Does UT library (or UPenn) have a book scanner? [2009-08-16 13:34:10] <yitz_> Not that I know of (UT) [2009-08-16 13:34:24] <Aharon> don't know. 
i always go with my own scanner [2009-08-16 13:34:43] <Aharon> Here is the Rav Amram siddur we were talking about a hlaf year ago: http://search1.library.utoronto.ca/UTL/index?N=0&Nr=p_catalog_code:3348920&showDetail=first [2009-08-16 13:36:06] <@EfraimDF> it's worth checking. book scanners are *much* faster and won't damage binding [2009-08-16 13:39:42] <Aharon> i'm not familiar with any libraries that have the book scanners you're describing [2009-08-16 13:40:21] <@EfraimDF> Harvard has one. I thought others might too [2009-08-16 13:41:28] * yitz_ is off to lunch [2009-08-16 13:41:49] <yitz_> Feel free to nick/pm me anything for later pickup [2009-08-16 13:42:00] <Azriel> yitz_: we will post the logs [2009-08-16 13:42:03] <Azriel> on our wiki [2009-08-16 13:42:10] <Azriel> and mail it out to the list [2009-08-16 13:42:22] <Azriel> EfraimDF: remind me again, is hebrewbooks PD? [2009-08-16 13:42:56] <Aharon> btw, I count 89 symbols (sygils?) for a prospective font designer/artist to support in creating a unicode 5.0 font-face [2009-08-16 13:42:56] <@EfraimDF> some of it [2009-08-16 13:43:05] <@EfraimDF> their scans tend to be bad quality [2009-08-16 13:43:19] <@EfraimDF> glyphs is the word you're looking for [2009-08-16 13:44:22] <@EfraimDF> I can usually read hebrewbooks.org scans. I couldn't transcribe from most of them [2009-08-16 13:45:43] <Azriel> http://www.hebrewbooks.org/7284 ? [2009-08-16 13:45:53] <Azriel> u might have pointed this out b4 [2009-08-16 13:46:50] <@EfraimDF> yes, I downloaded a copy of it :-) [2009-08-16 13:46:58] <@EfraimDF> (part I also) [2009-08-16 13:47:40] <Aharon> isn't there an issue with using their scan? [2009-08-16 13:47:43] <@EfraimDF> that's the Frumkin version of Seder Rav Amram with a 1912-Ashkenazic siddur tagging along with it. [2009-08-16 13:47:50] <@EfraimDF> The scan is horrible quality [2009-08-16 13:48:12] <Aharon> ie., isn't their scan copyrighted even if the work scanned is PD? 
[2009-08-16 13:48:24] <@EfraimDF> They may claim copyright over the scan [2009-08-16 13:48:37] <Aharon> Yonah could make us a nicer scan [2009-08-16 13:48:38] <@EfraimDF> (or, rather, whoever scanned it may claim copyright over the scan) [2009-08-16 13:48:59] <@EfraimDF> it's actually a lower priority, I'd say, than Siddur Torah Or [2009-08-16 13:49:02] <Aharon> so effectually, it needs to be scanned by Open Siddur or an ally [2009-08-16 13:49:15] <Aharon> *effectively [2009-08-16 13:49:20] <Azriel> wikipidia claims the hebrewbooks is all PD [2009-08-16 13:49:25] <@EfraimDF> yes, copyright notwithstanding anyway. [2009-08-16 13:49:29] <@EfraimDF> I think Wikipedia is wrong. [2009-08-16 13:49:30] <Aharon> hmm [2009-08-16 13:49:39] <@EfraimDF> Some of the books are too recent to be PD [2009-08-16 13:49:41] <Azriel> the best way to know [2009-08-16 13:49:44] <Azriel> is to email them [2009-08-16 13:49:50] <Azriel> and get confirmation [2009-08-16 13:49:59] <@EfraimDF> yes, we can email them. Either way, they're *very* hard to transcribe from. [2009-08-16 13:50:04] <Azriel> because its likely that among there 25000(!) scans [2009-08-16 13:50:13] <Azriel> they have some good resources [2009-08-16 13:50:20] <Azriel> if not only for commentary etc. [2009-08-16 13:50:34] <Azriel> and [2009-08-16 13:50:40] <Azriel> using smart keywords [2009-08-16 13:51:00] <Azriel> we can try to search the one that were successfuly OCR'd [2009-08-16 13:51:07] <Azriel> (they are rated) [2009-08-16 13:51:24] <Azriel> and find *any* siddur that they have [2009-08-16 13:51:34] <@EfraimDF> It is better than nothing. [2009-08-16 13:52:19] <Aharon> http://data.hebrewbooks.org/forms/feedbackform.html [2009-08-16 13:53:33] <@EfraimDF> Go ahead and contact them. 
I just want to make sure we are extremely careful at every step we take [2009-08-16 13:53:36] <@EfraimDF> We are being watched :-) [2009-08-16 13:53:57] <@EfraimDF> (from two directions) [2009-08-16 13:54:22] <Aharon> this is a good question for our legal team [2009-08-16 13:54:55] <@EfraimDF> what's the question? [2009-08-16 13:56:12] <Aharon> are we permitted to transcribe texts from digital scans of Public Domain works when those scanned images may be copyrighted? [2009-08-16 13:57:39] <Aharon> to hebrewbooks my question will be: what is the extent of my permission to use a scan of a PD text that you have in your archive? [2009-08-16 13:59:11] <Aharon> alternately: i would like to use scan of PD texts in your archive for a transcription project, the result of which will be CC-BY or CC-BY-SA digitized text. Do you have any objections? [2009-08-16 13:59:35] <Aharon> i'm just thinking i'm not asking the right question [2009-08-16 13:59:49] <Azriel> CC0 [2009-08-16 14:00:54] <@EfraimDF> Aharon -- answer is yes [2009-08-16 14:01:19] <@EfraimDF> actually, the results from a PD original source should be CC0 [2009-08-16 14:01:54] <@EfraimDF> the haggadah was transcribed from a copyrighted scan, eg [2009-08-16 14:02:33] <@EfraimDF> on the other hand, that makes it *very* hard to prove a case should one ever come up [2009-08-16 14:02:44] <@EfraimDF> and it makes proofreading/error correcting extremely difficult [2009-08-16 14:02:51] <Aharon> the issue for JLP is whether the copyright scans could be redistributed, in a sense, on a transcription engine we're hosting [2009-08-16 14:02:54] <Aharon> correct? [2009-08-16 14:03:16] <@EfraimDF> if the scan truly gets a new copyright (that's the legal q), they can't be [2009-08-16 14:04:51] <@EfraimDF> a similar question is scans of facsimile editions. Project Gutenberg decided that the could transcribe off facsimile eds. 
[2009-08-16 14:04:51] <Azriel> hmm [2009-08-16 14:04:59] <Aharon> ok, perhaps our question is different: [2009-08-16 14:05:15] <@EfraimDF> but there the content is certainly still PD. They don't keep copies of their originals [2009-08-16 14:05:24] <Azriel> this reminds me, if anyone has a computer without hebrew fonts installed [2009-08-16 14:05:28] <Aharon> is a transcription of a scan of PD text considered to be a derivative work of that scan? [2009-08-16 14:05:30] <Azriel> please check out the wiki [2009-08-16 14:05:40] <Azriel> and see if you can read the hebrew texts [2009-08-16 14:05:41] <@EfraimDF> a transcription of a PD text is PD [2009-08-16 14:05:54] <@EfraimDF> it makes no difference what it was transcribed from. [2009-08-16 14:06:36] <Aharon> then the question to hebrew books for our project would be: can we use the scanned image of one of your books for a collaborative transcription project. [2009-08-16 14:06:37] <@EfraimDF> I linked to a number of pages on this topic over the ML [2009-08-16 14:07:01] <@EfraimDF> and, can we copy the scan over to our project, modify it (get into page by page format), and redistribute it [2009-08-16 14:07:34] <@EfraimDF> for any pre-1923 book they have, we don't need permission to transcribe it [2009-08-16 14:07:59] <Aharon> as explained by Project Guttenburg [2009-08-16 14:08:38] <@EfraimDF> eg, we can't be held to "personal use only" restrictions [2009-08-16 14:11:31] <Aharon> doesn't PD only apply pre-1923 for texts published in the U.S.? [2009-08-16 14:13:02] <@EfraimDF> It depends. I think there are rare cases when something can be pre-1923 and non-PD in a non-US country [2009-08-16 14:13:32] <@EfraimDF> most of the copyright terms have been harmonized (lechumra) [2009-08-16 14:14:04] <@EfraimDF> http://www.copyright.cornell.edu/resources/publicdomain.cfm [2009-08-16 14:17:09] <@EfraimDF> http://onlinebooks.library.upenn.edu/c-fineprint.html