This is Bolderbast.Inducks.org, part of the inducks.org website.

The art of batch indexing

This section is about "Batch indexing". In ancient times it was just called "Indexing", but nowadays there is also the on-line indexing feature on COA.

Batch indexing may look more complicated than it is. I'll try to give a short introducktion to the "art of indexing" here. More detailed descriptions are on this page and the pages after that.

Let me start by giving an example. A Danish comic was indexed as follows:
AA1999-01    h3 [issdate:1999-01-07] [price:16.50] [pages:64] [inx:ORN]
AA1999-01a  D 14906        1 c                  DD  [desc:saws hole for ice-fishing - so does sawfish] [xapp:DD]
AA1999-01b  YD 50-04-28    0q           AT      DD  [desc:wants to kiss the bride] [xapp:DD]
AA1999-01c  D 98078       12        ->  JCF     DD  Nødhjælp [writ:PMG,CMG] [xapp:DD,HDL]
AA1999-01d  D 97317        1        PBl PAC     DD  Kulinarisk kollision [xapp:DD]
AA1999-01e  D 97195        6        JSu TSa     DD  Spøgelseshuset [xapp:DD,HDL]
AA1999-01f  ZM 49-11-13    1        BWa MGo     MM  Elefantastisk sprogtalent [xapp:GO,MM]
AA1999-01g  D 97563        4        PHe JMl     MIM Et hundeliv [xapp:MIM]
AA1999-01h  D 97363        6    LJe CSp AUz     US  Et glimt af fremtiden [xapp:QF,GY,US,Chip Gearloose]
AA1999-01i  D 97241       10        ->  JCF     DD  Skytsengel på glatis [writ:Per Olsen] [xapp:HDL,DD,US,DA,Jones,GD]
AA1999-01k  ZM 49-02-13    1        BWa MGo     MM  Startkapital [xapp:MM,GO]
AA1999-01l  D 94138        8        SPt CFe     MM  Musefælden [part:1] [xapp:CL,GO,MI,MM,PL,HH]
AA1999-01m                 1 g                  GY  [puzzle] [desc:plays trumpet wearing earmuffs]
AA1999-01n  ZM 41-08-10    1                    MM  Dårligt varsel [xapp:GO,MM]

In the above, you see the following:

1. One line with "header" information, listing the issue number (AA1999-01 = Anders And #1, 1999), publication date, price, number of pages, and the indexer of the issue (ORN).

Then several lines with the contents of the comic: covers, stories, illustrations.

2. Every line starts with the issue code (AA1999-01) and a sequence letter ('a' to 'n').

3. A story code. This is the most important data on a line, because it links the issue data with other data in the database. For instance, the cover of the comic is coded "D 14906". Elsewhere we have already stored information about that cover (data from other sources), like its main character, English description, appearing characters, etc.
We use story codes that look a lot like the codes printed in the comic, but there are some differences (see for instance our way of writing newspaper dates in "ZM 49-11-13").

4. Page count and page layout. "0q" = 1/4 page ("zero and a quarter"). "1 c" = one page, a cover. "1 i" = one page, an illustration.

5. Plot credit of a story (if not the same as the writer). In the example above, only AA1999-01h has one (LJe).

6. Writer of the story.

7. Artist of the story.

8. Inker of the story (if the artist did not ink as well). For these four credits, we use abbreviations. Some people who appear less frequently don't have an abbreviation. In that case we put "->" in the column and give the full name somewhere after the title, for instance [writ:Per Olsen]. It's also possible that a credit does not fit in the column because two people worked together. In the example: [writ:PMG,CMG].
The tags to use for this are [plot:] [writ:] [art:] [ink:].

9. Title hero of the story. For Disney characters, we also have abbreviations (DD = Donald Duck etc.). These are mostly based on the characters' English names.

10. The title of the story. In the example above: Danish titles.

11. Extra information, like a full list of appearing characters in the story [xapp:...], an English description [desc:...], or an indication of the part of the story [part:1].
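To make the structure described in the list above a bit more concrete, here is a small Python sketch that splits a line into its issue code, sequence letter and bracketed tags. It is not part of any Inducks tool. The function names (`split_entry_code`, `extract_tags`), the regular expressions, and the choice of which tags hold comma-separated lists are all my own assumptions, based only on the example on this page.

```python
import re

# Assumption: an issue code ends in a digit and an entry adds one lowercase
# sequence letter ("AA1999-01" + "a"). The header line has no letter.
ENTRY_CODE = re.compile(r"^(\S*\d)([a-z])?(?:\s|$)")

# A tag is "[key:value]" or a bare flag such as "[puzzle]".
TAG = re.compile(r"\[([a-z]+)(?::([^\]]*))?\]")

# Assumption: these tags hold comma-separated lists; others are kept as text.
LIST_TAGS = {"plot", "writ", "art", "ink", "xapp"}


def split_entry_code(line):
    """Return (issue_code, sequence_letter); the letter is None on a header line."""
    match = ENTRY_CODE.match(line)
    if match is None:
        raise ValueError("line does not start with an issue code: %r" % line)
    return match.group(1), match.group(2)


def extract_tags(line):
    """Collect the bracketed tags of a line into a dictionary."""
    tags = {}
    for key, value in TAG.findall(line):
        if key in LIST_TAGS:
            tags[key] = [item.strip() for item in value.split(",")]
        else:
            tags[key] = value  # bare flags like [puzzle] get an empty string
    return tags


HEADER = "AA1999-01    h3 [issdate:1999-01-07] [price:16.50] [pages:64] [inx:ORN]"
ENTRY = "AA1999-01c  D 98078       12        ->  JCF     DD  Nødhjælp [writ:PMG,CMG] [xapp:DD,HDL]"

print(split_entry_code(HEADER), extract_tags(HEADER))
print(split_entry_code(ENTRY), extract_tags(ENTRY))
```

The sketch deliberately leaves the columns between the entry code and the title alone (story code, page count, credits, hero), because their exact positions are not spelled out on this page.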

Now all this is very elaborate, and a lot of the data given is not strictly necessary. If the story codes are known in the database, other info (like credits, appearing characters) may also already be known. The minimum amount of information we need for an issue is, for instance, as follows:
AA1999-01    h3 [issdate:1999-01-07] [inx:ORN]
AA1999-01a  D 14906        1 c                  DD
AA1999-01b  YD 50-04-28    0q                   DD
AA1999-01c  D 98078       12                    DD  Nødhjælp
AA1999-01d  D 97317        1                    DD  Kulinarisk kollision
AA1999-01e  D 97195        6                    DD  Spøgelseshuset
AA1999-01f  ZM 49-11-13    1                    MM  Elefantastisk sprogtalent
AA1999-01g  D 97563        4                    MIM Et hundeliv
AA1999-01h  D 97363        6                    US  Et glimt af fremtiden
AA1999-01i  D 97241       10                    DD  Skytsengel på glatis
AA1999-01k  ZM 49-02-13    1                    MM  Startkapital
AA1999-01l  D 94138        8                    MM  Musefælden [part:1]
AA1999-01m                 1 g                  GY  [puzzle] [desc:plays trumpet wearing earmuffs]
AA1999-01n  ZM 41-08-10    1                    MM  Dårligt varsel

This is the same issue as in the example above, but with a lot less info. Note that all the data for the puzzle is still there (AA1999-01m), since that entry has no valid story code. There will be a lot of cases where story codes are not given in the comic itself. In those cases, it is useful to list the appearing characters, a description, etc., so that we can recognise the story and find the right story codes later.
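If you want to check a minimal index for entries that need this extra info, something like the following Python sketch could help. It is only an illustration under my own assumptions: the function names are made up, and the test for "has a story code" relies on the pattern seen in the examples on this page, where story codes start with letters (D, YD, ZM) while an entry without a code goes straight to the numeric page count.

```python
import re


def has_story_code(entry_line):
    """Guess whether an entry line (not the header line) carries a story code.

    Assumption: after the entry code, a story code starts with a letter,
    whereas a line without one continues directly with the page count.
    """
    parts = entry_line.split(None, 1)
    if len(parts) < 2:
        return False
    return not parts[1][0].isdigit()


def lacks_identifying_info(entry_line):
    """True if there is no story code and no [desc:] or [xapp:] tag either."""
    if has_story_code(entry_line):
        return False
    return re.search(r"\[(desc|xapp):[^\]]+\]", entry_line) is None


lines = [
    "AA1999-01c  D 98078       12                    DD  Nødhjælp",
    "AA1999-01m                 1 g                  GY  [puzzle] [desc:plays trumpet wearing earmuffs]",
    "AA1999-01m                 1 g                  GY  [puzzle]",
]
for line in lines:
    print(line[:10], "needs a description" if lacks_identifying_info(line) else "ok")
```

Only the third line, the puzzle with its description removed, is reported as needing more info.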

So you could start making indexes like the second example. If you're still interested, please contact me (h dot w dot fluks at wxs dot nl) so we can discuss things a bit further before you start!

For more information, choose a link on the how-to-index page.
   
(This page was generated by AweGen 5.3 on 2016-07-06)