
FILE: norwegian.words
VERSION: DEC-SRC-92-Apr-05

EDITOR

    Jorge Stolfi <stolfi@src.dec.com>
    DEC Systems Research Center
  
AUTHOR OF ORIGINAL WORDLIST

    Anders Ellefsrud <anders@ifi.uio.no>

DESCRIPTION

    The file norwegian.words is a list of over 60,000 Norwegian words.

    The file has one word per line, and is sorted with sort(1)
    in plain ASCII collating sequence.
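
    A minimal sketch of how one might check the ordering claim, in
    Python.  The helper names load_words and is_ascii_sorted are
    hypothetical and not part of this package.  Python compares
    strings by code point, which coincides with the ASCII collating
    sequence for a pure-ASCII file such as this one.

```python
def load_words(path="norwegian.words"):
    """Read the wordlist, one word per line (path is an assumption)."""
    with open(path, encoding="ascii") as f:
        return [line.rstrip("\n") for line in f]


def is_ascii_sorted(words):
    """Return True if each word is >= its predecessor in ASCII order."""
    return all(a <= b for a, b in zip(words, words[1:]))
```

    For example, is_ascii_sorted(load_words()) should be true for an
    unmodified copy of the file.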

    The file apparently does not contain any proper nouns.
    It is supposed to contain all word inflections and verb
    tenses, but it is still extremely incomplete (as one can deduce
    from its size).

    Umlauts and circle-accents are respectively denoted by a double
    quote (") and at-sign (@) after the modified vowel ("a" or "o").
    Besides the letters [a-z], the file uses only double quotes,
    at-sign, and newline.
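
    As an illustration, the letter-accent pairs could be converted to
    the corresponding Unicode letters roughly as follows.  This is
    only a sketch: the function name decode_word is hypothetical, and
    the mapping to the letters ae, aa, and oe is taken from the
    COMPILATION PROCESS section of this README.

```python
# Letter-accent pairs as described in this README, mapped to the
# Norwegian letters they stand for (a" = ae, a@ = aa, o" = oe).
PAIRS = {
    'a"': "\u00e6",  # ae ligature
    "a@": "\u00e5",  # a with ring above
    'o"': "\u00f8",  # o with stroke
}


def decode_word(word):
    """Replace each letter-accent pair with a single Unicode letter."""
    for pair, letter in PAIRS.items():
        word = word.replace(pair, letter)
    return word
```

    Since the file uses double quote and at-sign only as accent
    marks, a plain substring replacement should be sufficient.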

AUXILIARY LISTS

    In the same directory as norwegian.words you will find the 
    following file:

    norwegian.trash

        A list of 5 words (mostly proper names) from the original
        wordlist that were intentionally removed from norwegian.words.

ORIGINAL LISTS 

    The original wordlist from which this file was compiled is listed
    below.  It was obtained by anonymous FTP on 92-Feb-10.

    [1] from: relay.cs.toronto.edu : /doc/Dictionaries
        file: words.norwegian.Z
        size: 258162 bytes (589234 bytes uncompressed)
        author: Anders Ellefsrud <anders@ifi.uio.no>

    COMMENT: The list [1] contains mainly lowercase words (only 4 or 5
    exceptions). It uses the characters {, }, and | to represent the
    Norwegian special letters "ae", "aa", and "oe", respectively.
    
    The author of the list says that it contains many errors,
    and that a better one is being prepared.

COMPILATION PROCESS

    The file norwegian.words is basically the file words.norwegian
    [1], with uppercase words removed to norwegian.trash, and the
    characters {, }, and | expanded into the letter-accent pairs
    a" (ae, a-umlaut), a@ (aa, a-circle), and o" (oe, o-umlaut),
    respectively.
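
    The process above can be sketched in Python as follows.  This is
    not the script that was actually used: the function name
    compile_wordlist is hypothetical, and treating any word that
    contains an uppercase letter as an "uppercase word" is an
    assumption.

```python
# Expansion of the original special characters into letter-accent
# pairs: { -> a" (ae), } -> a@ (aa), | -> o" (oe).
EXPAND = str.maketrans({"{": 'a"', "}": "a@", "|": 'o"'})


def compile_wordlist(lines):
    """Split the original list into (words, trash).

    Words containing an uppercase letter go to trash unchanged
    (assumption); the rest have {, }, | expanded and are sorted in
    ASCII order.
    """
    words, trash = [], []
    for line in lines:
        word = line.strip()
        if not word:
            continue  # skip blank lines
        if any(c.isupper() for c in word):
            trash.append(word)
        else:
            words.append(word.translate(EXPAND))
    return sorted(words), sorted(trash)
```

    For example, the input lines "bl}", "Oslo", and "|l" would yield
    the words bla@ and o"l, with Oslo set aside as trash.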

(NON-)COPYRIGHT STATUS

  To the best of my knowledge, all the files I used to build these
  wordlists were available for public distribution and use, at least
  for non-commercial purposes.  I have confirmed this assumption with
  the authors of the lists, whenever they were known.
  
  Therefore, it is safe to assume that the wordlists in this package
  can also be freely copied, distributed, modified, and used for
  personal, educational, and research purposes.  (Use of these files in
  commercial products may require written permission from DEC and/or
  the authors of the original lists.)
  
  Whenever you distribute any of these wordlists, please distribute
  also the accompanying README file.  If you distribute a modified
  copy of one of these wordlists, please include the original README
  file with a note explaining your modifications.  Your users will
  surely appreciate that.

(NO-)WARRANTY DISCLAIMER

  These files, like the original wordlists on which they are based,
  are still very incomplete, uneven, and inconsistent, and probably
  contain many errors.  They are offered "as is" without any warranty
  of correctness or fitness for any particular purpose.  Neither I nor
  my employer can be held responsible for any losses or damages that
  may result from their use.

