Greymatter: Logware

Greymatter
Weblog/Journal Software . Version 1.7.2 . Software Developers Guide
Copyright (c) 2000-2007 The Greymatter Team . All Rights Reserved

Purpose

These documents live outside of the code and are meant to give the 'big picture'. Note that code comments supersede any documentation here. This document describes the file layout, that is, what information each file holds. The API for getting at this information is documented in the Perl modules. For example, look at the libs/Gm_Storage.pm file and you will see each public method with documentation on what arguments it takes, what it returns, how to use it, and what (if anything) it deprecates.

Developer rules/guidelines/style guide

Note that I don't really care about the style and format of the code. Rather, I care that there are comments and that the API is used and not worked around.

  • Strict. Always, always, always 'use strict' in any new module. Perl's strict pragma will catch countless careless mistakes (we've all made them) and no-brainers; it's just harder to develop without it.
  • Warn. 'use warnings'. Similar reasons as above, but it also encourages better code writing in that it reminds you to dot your i's and cross your t's.
  • Constants. Try to use constants wherever you can, specifically, use values in Gm_Constants. Several defects have been tracked down due to mis-typings of constant values/flags (for example ' open' vs 'open' and 'template' vs 'template').
  • Handlers. When appropriate take a 'handler' as a subroutine argument. A handler is simply a reference to a subroutine. This handler can then be determined by the calling subroutine (for example, if a user page (such as comments.cgi) calls a function, any errors should show the User Template, not the Admin look and feel).
  • Newlines. Unless printing directly to the screen, avoid newlines. Subroutines that save data should handle newlines if necessary. For example, Gm_Storage::addLogMessage adds a newline because it's working on a flat file. But if the message were stored in a database, it wouldn't make sense to have the newline at the end of the control panel message. So when addLogMessage is called, it takes data that is 'persistent storage neutral'. Let the Storage subroutines worry about newlines and flat-file storage. When we switch to a database, we won't have to hunt through the code getting rid of newline characters.
  • Printing. Return a string rather than printing; this is just more elegant. Leave prints to the calling subroutine if possible.
  • Encapsulation. Private subroutines should start with the '_' character. By private I mean it will never be called outside of this package.
  • Arguments. If you have more than 1 or 2 parameters, especially if they are not required, use named parameters such as in createRadioButton. By putting stuff in a hash we gain the flexibility to add more optional parameters without having to pass in '' placeholders or modify existing code.
  • Quotes. Use ' and " where appropriate. If you don't have any variables or newlines then use '; it's quicker and cleaner.
  • Strings. Don't put all text on one huge long line. It messes with some programs that don't do line wraps well (some cvs, some text editors, etc.). It also makes reading through the code difficult since people have to scroll down AND over.
  • Readability. Make the code easy for a human to read, so break things up onto multiple lines if a line of code scrolls off screen. This also applies to Perl-isms: for example, when calling 'shift' in a subroutine to get the arguments passed in, explicitly say 'shift(@_)' to make it more readable. Perl has many such hidden variables; always use the name and state them explicitly.
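
As a rough illustration of several of the guidelines above working together (strict, warnings, named parameters, a handler argument, a '_' prefix for private subroutines, and returning a string instead of printing), here is a minimal sketch. The subroutine buildGreeting and its parameters are invented for this example and are not part of Greymatter.

```perl
use strict;
use warnings;

# buildGreeting (hypothetical example, not a Greymatter subroutine)
#   ARG name       (required) the name to greet
#   ARG punct      (optional) closing punctuation, defaults to '.'
#   ARG errHandler (optional) code reference called with an error message
# Returns the greeting string, or undef after calling the handler.
sub buildGreeting {
    my %params  = @_;
    my $handler = $params{'errHandler'} || \&_defaultErrorHandler;

    if ( !defined( $params{'name'} ) || $params{'name'} eq '' ) {
        $handler->( 'name is required' );
        return;
    }

    my $punct = defined( $params{'punct'} ) ? $params{'punct'} : '.';

    # Return the string; the caller decides whether to print it.
    return 'Hello ' . $params{'name'} . $punct;
}

# Private: never called from outside this file.
sub _defaultErrorHandler {
    my $message = shift(@_);
    warn( $message );
    return;
}

my @errors;
my $greeting = buildGreeting( name => 'Noah', punct => '!' );
my $failed   = buildGreeting( errHandler => sub { push( @errors, shift(@_) ); } );

print( $greeting, "\n" );
```

Because the arguments are passed as a hash, a new optional parameter can be added later without touching existing callers, and the caller chooses how errors are shown by passing its own handler.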

Comments

Many people hate to comment, but comments are useful not only to others but to yourself. I am not going to go through all the reasons for commenting. Instead, I would like to say _how_ to comment.

Do not say 'what' something is doing unless it's not obvious (and that may be a sign in itself); regular expressions are the exception, of course. However, for comments on a subroutine to be useful (and to ensure that the subroutine is found and reused; if people don't know what your subroutine does, how likely are they to reuse it?), always state the arguments it takes (such as 'ARG1', 'ARG2' for lists or 'ARG errHandler' for named parameters) and what the subroutine returns.

It is also nice to include a summary of what it does, if it isn't painfully obvious already.
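
A comment block in this spirit might look like the following sketch. The subroutine parseClockTime is hypothetical and only serves to show the layout: a one-line summary, the arguments, the return value, and a note on the one regular expression.

```perl
use strict;
use warnings;

# parseClockTime (hypothetical example, not a Greymatter subroutine)
# Splits a clock time into its parts.
#   ARG1 a time string such as '09:05 PM'
# Returns the list (hour, minute, ampm), or an empty list if ARG1
# does not look like a time.
sub parseClockTime {
    my $text = shift(@_);

    # Matches: two digits, a colon, two digits, one space, then AM or PM.
    if ( $text =~ m/^(\d\d):(\d\d) (AM|PM)$/ ) {
        return ( $1, $2, $3 );
    }
    return ();
}

my ( $hour, $minute, $ampm ) = parseClockTime( '09:05 PM' );
my @nothing = parseClockTime( 'soon' );
```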

Roadmap

This is a road map of what I would like to do with the code and where I am moving stuff to (i.e. the method to my madness).

The files
While having everything in gm-library.cgi has its advantages in that it's easy to upgrade and install, the tradeoffs are too great. I was amazed that the file has 12,000+ lines! Such a file is hard to navigate, and it is harder to see what might be duplicating what. By breaking this file into sections and refactoring the code to use scoping and encapsulation, I think many benefits will be achieved. The libs directory will be the new repository of the code, with no configuration or data persistence in the libs, just the library files (Perl modules).

The library Gm_Core will contain all of the essential functions of GreyMatter such as applying the templates, formatting an entry, etc. Gm_Web will contain methods that relate primarily to communicating with the user, such as displaying screens and forms. Gm_Utils will contain the small miscellaneous functions that will streamline other code, such as checking for hacks in a string and formatting the date.

Gm_Storage is the location of all data persistence mechanisms, by which I mean flat-file reading or, if it were written, database access. The idea is that the rest of GreyMatter doesn't care _how_ the data is stored, as long as it's returned in a consistent manner. Gm_Upgrade focuses on upgrading from one version to another without cluttering up the other code, particularly Gm_Storage, since upgrading does need to know how the data is stored. Gm_Constants is just a file of constants that are used to avoid careless mistakes such as 'Yes' vs 'yes', or testing for an empty string '' when actually looking for a string with a space ' '.

Static vs Dynamic
As noted with Gm_Storage, I am trying to remove GreyMatter's dependency on flat files so that if someone wants to use a database, it's at least possible by rewriting the Gm_Storage library. However, GreyMatter also generates static content through rebuilds, comments, etc. For now, I think this is an advantage in that it's quicker and lighter (in server load) than dynamically generating each page. The current GreyMatter is even flexible enough that PHP pages could be generated instead of HTML, through the configs and templates.

Admin Issues
Until I learned more about css, I did everything with tables for the appearance of pages, such as borders. With css, we not only get more flexibility, but the html is also easier to read when working on the code. The current admin screens are very table heavy and I would like to redo them to keep the same style but using div tags. At the same time I would also like to redo the navigation of the admin screen so that there is less clicking around. I am planning a constant menu on every page that would allow the user to at least jump around to the major sections. And lastly, a help feature would be useful: simply a question icon next to words and fields that would link to the gm_manual (popup perhaps?).

Templates (gm-templates.cgi)

    The template file is now order independent, meaning that there is no special order to the templates stored within. This is because the format of the template is now:
    template_name=template_value
    One template per line, each line beginning with the name of the template.
  • Naming. Template names should always end with the word 'template', as this will make it obvious that it's a template and easy to pull out of a hash (a form submission, for example, will contain templates and other values).
  • Todo: List template variables and what they are used for...
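
The name=value layout described above can be read with a few lines of Perl. This is only a sketch of the idea and not the Gm_Storage implementation; the subroutine name parseNameValueLines and the two template names are made up for illustration.

```perl
use strict;
use warnings;

# parseNameValueLines (hypothetical, not the Gm_Storage code)
#   ARG1 reference to an array of 'name=value' lines (newlines allowed)
# Returns a hash reference of name => value. Line order does not matter.
sub parseNameValueLines {
    my $lines = shift(@_);
    my %values;

    foreach my $line ( @$lines ) {
        my $copy = $line;
        chomp( $copy );
        next if ( $copy eq '' );

        # Split on the first '=' only, so a value may itself contain '='.
        my ( $name, $value ) = split( /=/, $copy, 2 );
        $values{$name} = defined( $value ) ? $value : '';
    }
    return \%values;
}

# Invented sample data; these are not real Greymatter template names.
my $templates = parseNameValueLines( [
    "examplefootertemplate=<a href=\"/\">home</a>\n",
    "exampleheadertemplate=<h1>My log</h1>\n",
] );
```

Limiting the split to two parts matters for templates in particular, since HTML attributes such as href="/" contain '=' characters.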

Configs (gm-configs.cgi)

    The config file is now order independent, meaning that there is no special order to the configuration variables stored within. This is because the format of the config is now:
    config_name=config_value
    One config per line, each line beginning with the name of the config.
  • Naming. Config names should always start with the word 'gm', as this will make it obvious that it's a GreyMatter config and easy to pull out of a hash (a form submission, for example, will contain configs and other values).
  • Todo: List config variables and what they are used for...
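
The naming rules for templates and configs are what make it easy to pull them out of a mixed hash. A small sketch, assuming a form submission has already been read into a hash; every key below is invented for the example.

```perl
use strict;
use warnings;

# Invented form submission; none of these keys are real Greymatter names.
my %form = (
    'gmexamplepath'   => '/var/www/log',
    'exampletemplate' => '<p>entry</p>',
    'submit'          => 'Save',
);

# Names ending in 'template' are templates.
my %templates = map { ( $_, $form{$_} ) } grep { m/template$/ } keys( %form );

# Names starting with 'gm' are configs.
my %configs = map { ( $_, $form{$_} ) } grep { m/^gm/ } keys( %form );
```

Note that a name that both starts with 'gm' and ends with 'template' would land in both hashes with this simple approach, so real code would need to decide which rule wins.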

Counter (gm-counter.cgi)

    The counter file is now order independent, meaning that there is no special order to the counter values stored within. This is because the format of the counter is now:
    counter_name=counter_value
    One counter per line, each line beginning with the name of the counter variable.
  • Naming. Counter variables don't follow as precise a pattern as configs and templates. Rather the counter name attempts to be descriptive, while not providing redundant information.
  • The counter variables and their meaning:
    entrytotal = total number of entries posted
    archivetotal = total number of entries not on front page
    stayattopentry = entry# designated as "Stay At Top", marked 0 if there's no such thing
    karmapos = total positive karma votes
    karmaneg = total negative karma votes
    commenttotal = total comments posted
    opentotal = total number of open entries
    closedtotal = total number of closed entries

Entrylist (gm-entrylist.cgi)

    The entrylist file is now order independent, meaning that there is no special order to the entrylist lines within. The order of the items isn't needed for any of the current functionality provided by GreyMatter. Here is an example of sorting by entry number (highest first):
    my $gmentrylist = Gm_Storage::getEntrylist( errHandler=>\&Gm_Web::displayAdminErrorExit );

    foreach my $entry ( sort { $gmentrylist->{$b}{'id'} <=> $gmentrylist->{$a}{'id'} } keys( %$gmentrylist ) ) {
    ...
  • One entry per line, each line containing the following values separated by the '|' character:
    1. id = the numerical id of the entry (key of returned hash)(usually order in which entered, never 0)
    2. author = entry author's name (must be alphanumeric)
    3. subject = entry subject (must be alphanumeric)
    4. created = CREATE Date of entry in the format of mm/dd/yy (does include leading zeros)
    5. createt = CREATE Time of entry in the form of hh:mm [AM/PM] (does include leading zeros)
    6. status = entry status: open/closed, either O or C
    7. extended = is this an extended entry, either Y or N
    8. music = current music of entry (well author really)
    9. mood = current mood of entry (well author really)
    10. emoticons = are emoticons enabled, yes or no
  • Naming. Entrylist fields don't follow as precise a pattern as configs and templates. Rather the field name attempts to be descriptive, while not providing redundant information.
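
To make the line format concrete, here is a sketch of how entrylist lines could be turned into the kind of hash used in the sorting example above. The real parsing lives in Gm_Storage; parseEntrylist and the sample lines are assumptions made for illustration, including the way the id is written.

```perl
use strict;
use warnings;

# Field order as listed above.
my @ENTRYLIST_FIELDS = qw( id author subject created createt
                           status extended music mood emoticons );

# parseEntrylist (hypothetical, not the Gm_Storage code)
#   ARG1 reference to an array of entrylist lines
# Returns a hash reference keyed by entry id; each value is a hash
# reference of field name => value.
sub parseEntrylist {
    my $lines = shift(@_);
    my %entries;

    foreach my $line ( @$lines ) {
        my $copy = $line;
        chomp( $copy );
        next if ( $copy eq '' );

        # The -1 limit keeps trailing empty fields (no music or mood).
        my @values = split( /\|/, $copy, -1 );
        my %entry;
        @entry{@ENTRYLIST_FIELDS} = @values;
        $entries{ $entry{'id'} } = \%entry;
    }
    return \%entries;
}

# Invented sample lines.
my $gmentrylist = parseEntrylist( [
    "1|Alice|First post|01/05/07|09:15 AM|O|N|||no\n",
    "2|Bob|Second post|01/06/07|10:30 PM|C|Y|Jazz|calm|yes\n",
] );

# Highest entry number first.
my @newestFirst = sort { $gmentrylist->{$b}{'id'} <=> $gmentrylist->{$a}{'id'} }
    keys( %$gmentrylist );
```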

Authors (gm-authors.cgi)

    The authors file is now order independent, meaning that there is no special order to the author information within. The order of the items isn't needed for any of the current functionality provided by GreyMatter. Here is an example of sorting by author name alphabetically:
    my $gmauthors = Gm_Storage::getAuthors( errHandler=>\&Gm_Web::displayAdminErrorExit );

    foreach my $author ( sort { $gmauthors->{$a}{'author'} cmp $gmauthors->{$b}{'author'} } keys( %$gmauthors ) ) {
    ...
  • One author per line, each line containing the following values separated by the '|' character:
    1. author = author's name (key of returned hash (case sensitive))
    2. password = author's password (crypted)
    3. email = author's email
    4. homepage = author's homepage
    5. created = CREATE Date of the author
    6. posttotal = total number of postings by this author
    7. postnew = can this author make new posts Y or N
    8. editentries = can this author edit entries Y or N
    9. editconfigs = can this author edit configs Y or N
    10. edittemplates = can this author edit templates Y or N
    11. editauthors = can this author edit other authors Y or N
    12. rebuild = can this author rebuild files Y or N
    13. viewcplog = can this author view the control panel log Y or N
    14. bookmarklets = can this author use bookmarklets Y or N
    15. upload = can this author upload files Y or N
    16. viewadmin = can this author access the admin screen (gm.cgi) Y or N
  • Naming. Authors don't follow as precise a pattern as configs and templates. Rather the author variable name attempts to be descriptive, while not providing redundant information.
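
Since most of the author fields are Y/N permission flags, a permission check reduces to a hash lookup once a line has been split. The following is a sketch only; parseAuthorLine, authorMay and the sample record are hypothetical.

```perl
use strict;
use warnings;

# Field order as listed above.
my @AUTHOR_FIELDS = qw( author password email homepage created posttotal
                        postnew editentries editconfigs edittemplates
                        editauthors rebuild viewcplog bookmarklets
                        upload viewadmin );

# parseAuthorLine (hypothetical)
#   ARG1 one line from the authors file
# Returns a hash reference of field name => value.
sub parseAuthorLine {
    my $line = shift(@_);
    chomp( $line );
    my %author;
    @author{@AUTHOR_FIELDS} = split( /\|/, $line, -1 );
    return \%author;
}

# authorMay (hypothetical)
#   ARG1 author hash reference, ARG2 permission field name
# Returns 1 if the field is 'Y', otherwise 0.
sub authorMay {
    my ( $author, $permission ) = @_;
    my $flag = $author->{$permission};
    return ( defined( $flag ) && $flag eq 'Y' ) ? 1 : 0;
}

# Invented sample record.
my $alice = parseAuthorLine(
    'Alice|XXcryptedXX|alice@example.com|http://www.example.com|01/05/07|12|Y|Y|N|N|N|Y|Y|N|Y|Y'
);
```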

Entry (00000001.cgi and up)

    The entry file remains unchanged. Note that the entry information is contained within the cgi file, from which the html files are generated (cgi is the default file type, but it can be different). This information was culled from the old greymatterforums site, originally contributed by Flipped Cracker (Robert). The layout of the file is much more complex than the other files. The first 4 lines give the majority of the information about a particular entry, with comments appearing after the 4th line:
    1. line, information about the post/entry
    2. line, Karma-related information. The IP addresses and the votes associated with those IP addresses are collated here.
    3. line, the "main text" of each entry.
    4. line, the "extended text" of each entry. If none, a blank line is left.
    5. line (and more if necessary), comments. One comment per line.
  • The entry information is stored in the first line, with the following information separated by the '|' character:
    1. id = numeric, never 0
    2. author = alphanumeric
    3. subject = title of post alphanumeric (we hope)
    4. weekday = numeric (0-6, 0=Sunday, 1=Monday, etc.)
    5. month = month of post numeric (1-12, no leading zeroes)
    6. day = day of post numeric (1-31, no leading zeroes)
    7. year = year of post numeric (format: yyyy)
    8. hour = hour of post numeric (1-12, no leading zeroes)
    9. minute = minute of post numeric (0-59, no leading zeroes)
    10. second = second of post numeric (0-59, no leading zeroes)
    11. ampm = either AM or PM
    12. karmapos = positive karma numeric
    13. karmaneg = negative karma numeric
    14. commenttotal = number of comments numeric; 0 if no comments
    15. karma = votes allowed yes/no
    16. comments = comments allowed yes/no
    17. status = entry open or closed open/closed
    18. music = current music of entry (well author really)
    19. mood = current mood of entry (well author really)
    20. emoticons = are emoticons enabled, yes or no
  • The karma votes are stored in the second line, with the following information separated by the '|' character (note that each entry has the default line '0.0.0.0|I'):
    1. ip = the ip that cast the karma vote
    2. vote = the karma vote, either a P for positive or N for negative
  • The main text of the entry is stored in the third line, with single line breaks replaced with '|*|' and double line breaks with '|*||*|'. All the text is presented as one line.
  • The extended text of the entry is stored in the fourth line, following the same text conventions of line 3.
  • The comments are stored in the fifth line and beyond, with the following information separated by the '|' character:
    1. name = commenter's name alphanumeric
    2. ip = in the form of xxx.xxx.xxx.xxx
    3. email = in the form of user@email.com. If not provided, left blank.
    4. homepage = commenter's webpage in the form of http://www.site.com. If not provided, left blank.
    5. weekday = of comment numeric (0-6, 0=Sunday, 1=Monday, etc.)
    6. month = of comment numeric (1-12, no leading zeroes)
    7. day = of comment numeric (1-31, no leading zeroes)
    8. year = of comment numeric (format: yyyy)
    9. hour = of comment numeric (1-12, no leading zeroes)
    10. minute = of comment numeric (0-59, no leading zeroes)
    11. second = of comment numeric (0-59, no leading zeroes)
    12. ampm = either AM or PM
    13. comment = text presented all on one line, with the same text replacement conventions as in the main entry text. (See Line 3.)
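
The '|*|' convention and the comment line layout can be sketched as follows. These helpers are hypothetical and not taken from the Greymatter source. One detail worth noting: because the '|*|' marker itself contains the field separator, the comment line is split with a limit so that the comment text stays in one piece. How a literal '|' typed by a commenter is handled is not covered here.

```perl
use strict;
use warnings;

# encodeEntryText (hypothetical): ARG1 text with line breaks.
# Returns the text on one line, each line break replaced with '|*|'.
sub encodeEntryText {
    my $text = shift(@_);
    $text =~ s/\r?\n/|*|/g;
    return $text;
}

# decodeEntryText (hypothetical): ARG1 stored one-line text.
# Returns the text with '|*|' turned back into line breaks.
sub decodeEntryText {
    my $stored = shift(@_);
    $stored =~ s/\|\*\|/\n/g;
    return $stored;
}

# Field order as listed above.
my @COMMENT_FIELDS = qw( name ip email homepage weekday month day year
                         hour minute second ampm comment );

# parseCommentLine (hypothetical): ARG1 one comment line (line 5 and beyond).
# Returns a hash reference of field name => value, comment text decoded.
sub parseCommentLine {
    my $line = shift(@_);
    chomp( $line );

    # Stop splitting at the last field so '|*|' inside the comment survives.
    my %comment;
    @comment{@COMMENT_FIELDS} = split( /\|/, $line, scalar( @COMMENT_FIELDS ) );
    if ( defined( $comment{'comment'} ) ) {
        $comment{'comment'} = decodeEntryText( $comment{'comment'} );
    }
    return \%comment;
}

# Invented sample data.
my $original = "First line\nSecond line\n\nNew paragraph";
my $stored   = encodeEntryText( $original );
my $comment  = parseCommentLine(
    'Carol|192.0.2.10|||2|1|9|2007|4|5|6|PM|Nice post.|*|Thanks!'
);
```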

Banlist (gm-banlist.cgi)

    The banlist is simply a list of Internet Protocol addresses that are prevented from using the functionality of the GreyMatter software, from posting comments to accessing the admin page. The banlist file is now order independent, meaning that there is no special order to the banlist information within. The order of the items isn't needed for any of the current functionality provided by GreyMatter.
  • One banned ip per line, each line containing the following values separated by the '|' character:
    1. ip = ip address of machine to ban (key of returned hash)
    2. host = the hostname of the banned ip (currently not used)
    3. label = an optional label to describe the banned ip
  • Naming. Banlist fields don't follow as precise a pattern as configs and templates. Rather the field name attempts to be descriptive, while not providing redundant information.
  • Spam. We are at the mercy of the webserver to tell us the ip of the request (through the REMOTE_ADDR environment variable). However, the webserver can be 'fooled' and given a bad ip (google: ip spoofing). Also, users on dialup or non-premium dsl/cable usually do not have a static ip (companies love to charge for this convenience). This means that banning by ip is usually ineffective against spammers, but it can be useful against nuisance users (most work places will be using static ips), and if you notice a pattern you could ban a range of ips.
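
A banlist lookup, including one possible way of banning a range, could be sketched like this. Both subroutines are hypothetical, and the rule that an entry ending in '.' bans every address starting with that prefix is an assumption made for this example, not documented Greymatter behaviour.

```perl
use strict;
use warnings;

# parseBanlist (hypothetical)
#   ARG1 reference to an array of banlist lines
# Returns a hash reference keyed by ip.
sub parseBanlist {
    my $lines = shift(@_);
    my %banned;

    foreach my $line ( @$lines ) {
        my $copy = $line;
        chomp( $copy );
        next if ( $copy eq '' );
        my ( $ip, $host, $label ) = split( /\|/, $copy, 3 );
        $banned{$ip} = { 'ip' => $ip, 'host' => $host, 'label' => $label };
    }
    return \%banned;
}

# isBanned (hypothetical)
#   ARG1 hash reference from parseBanlist
#   ARG2 the address reported by the webserver
# Returns 1 on an exact match, or when a banned entry ending in '.'
# is a prefix of ARG2 (assumed range rule); otherwise 0.
sub isBanned {
    my ( $banned, $address ) = @_;

    return 1 if ( exists( $banned->{$address} ) );

    foreach my $ip ( keys( %$banned ) ) {
        next unless ( $ip =~ m/\.$/ );
        return 1 if ( index( $address, $ip ) == 0 );
    }
    return 0;
}

# Invented sample data using documentation-only address ranges.
my $banlist = parseBanlist( [
    "198.51.100.7||spam comments\n",
    "192.0.2.||nuisance range\n",
] );
```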

Log (gm-cplog.cgi)

    The cplog file is a listing of log entries entered by GreyMatter to keep the user informed of certain events. This file is order dependent, with the first line being the oldest and the last being the most recent. Note that Gm_Storage functions treat the cplog information as an array, mostly to preserve order (also because there isn't really a logical key besides an arbitrary id number or the date+time the line was logged, and this would probably be a pain to sort).
  • One log 'entry' per line, with the date and time of the log entry usually added, but not always. Resetting the log is simply a matter of clearing out the file.
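
A flat-file version of the log handling could look like the sketch below. The subroutines are hypothetical stand-ins for what Gm_Storage provides, and for brevity they die on failure rather than taking an error handler. The point is that callers pass a message without a newline, the storage layer adds it, and the log comes back as an array with the oldest line first.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# appendLogLine (hypothetical)
#   ARG1 path of the log file, ARG2 message without a trailing newline
sub appendLogLine {
    my ( $path, $message ) = @_;
    open( my $fh, '>>', $path ) or die( "Cannot append to $path: $!" );
    print {$fh} $message, "\n";    # the newline belongs to the storage layer
    close( $fh );
    return;
}

# readLogLines (hypothetical)
#   ARG1 path of the log file
# Returns an array reference of log lines, oldest first, newlines removed.
sub readLogLines {
    my $path = shift(@_);
    open( my $fh, '<', $path ) or die( "Cannot read $path: $!" );
    my @lines = <$fh>;
    close( $fh );
    chomp( @lines );
    return \@lines;
}

# A temporary file stands in for the real log file.
my ( $tmpfh, $path ) = tempfile( UNLINK => 1 );
close( $tmpfh );

appendLogLine( $path, 'Alice logged in' );
appendLogLine( $path, 'Alice added entry #3' );
my $cplog = readLogLines( $path );
```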