Greymatter
Weblog/Journal Software . Version 1.6.1 . Software Developers Guide
Copyright (c) 2000-2006 The Greymatter Team . All Rights Reserved
These documents live outside of the code and help create a 'big picture'. Note
that code comments supersede any documentation here. This document describes the
file layouts, that is, what information each file holds. The API for getting at this information
is documented in the Perl modules. For example, look at the libs/Gm_Storage.pm file and you
will see each public method with documentation on what arguments it takes, what it returns,
how to use it, and what (if anything) it deprecates.
Note that I don't really care about the style and format of the code;
rather, I care that there are comments and that the API is used and not worked around.
- Strict. Always, always, always 'use strict' in any new module. Perl's strict pragma will
catch countless careless mistakes (we've all made them) and no-brainers; it's just
harder to develop without it.
- Warn. 'use warnings' as well. Similar reasons as above, but it also encourages better code
writing in that it reminds you to dot your i's and cross your t's.
- Constants. Try to use constants wherever you can, specifically, use values in
Gm_Constants. Several defects have been tracked down due to mistypings of constant
values/flags (for example ' open' vs 'open' and 'templete' vs 'template').
- Handlers. When appropriate, take a 'handler' as a subroutine argument. A
handler is simply a reference to a subroutine. The handler is chosen
by the calling subroutine, so the caller controls how errors are presented. For example,
if a user page (such as comments.cgi) calls a function, any errors
should show the User Template, not the Admin look and feel.
- Newlines. Unless printing directly to the screen, avoid newlines.
Subroutines that save data should handle newlines if necessary. For example,
Gm_Storage::addLogMessage adds a newline because it's working on a flat file. But
if the message were stored in a database, it wouldn't make sense to have the newline at the
end of the control panel message. So when addLogMessage is called, it takes data
that is 'persistent storage neutral'. Let the Storage subroutines worry about
newlines and flat-file storage. When we switch to a database, we won't have
to hunt through the code getting rid of newline characters.
- Printing. Return a string rather than printing; this is just more elegant. Leave prints
to the calling subroutine if possible.
- Encapsulation. Private subroutines should start with the '_' character. By private I mean
it will never be called outside of this package.
- Arguments. If you have more than 1 or 2 parameters, especially if they are not required,
use named parameters such as in createRadioButton. By putting stuff in a
hash we gain the flexibility to add more optional parameters without having
to pass in '' placeholders or modify existing code.
- Quotes. Use ' and " where appropriate. If you don't have any variables or newlines
then use '; it's quicker and cleaner.
- Strings. Don't put all text on one huge long line. It messes with some programs
that don't do line wraps well (some cvs, some text editors, etc.). It also makes
reading through the code difficult since people have to scroll down AND over.
- Readability. Make it easy to read for a human, so break things up onto multiple lines
if the line of code scrolls off screen. This also applies to Perlisms: for example, in a subroutine
that calls 'shift' to get the arguments passed in, explicitly say 'shift(@_)' to make it more readable.
Perl has many such hidden variables; always use the name and explicitly state them.
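Several of the guidelines above can be sketched together in one short script. Everything in it is hypothetical and for illustration only: the package name Gm_Example, the constants GM_OPEN and GM_CLOSED (stand-ins for values in Gm_Constants), and the subroutines formatStatus and _escapeHtml are not part of the actual GreyMatter API.

```perl
use strict;
use warnings;

package Gm_Example;    # hypothetical package, for illustration only

# Constants instead of repeated string literals (stand-ins for Gm_Constants).
use constant GM_OPEN   => 'open';
use constant GM_CLOSED => 'closed';

# Private helper: the leading '_' marks it as never called outside this package.
sub _escapeHtml {
    my $text = shift(@_);
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Named parameters: optional values can be added later without
# breaking existing callers or passing '' placeholders.
sub formatStatus {
    my %args       = @_;
    my $status     = $args{'status'};
    my $errHandler = $args{'errHandler'};

    if ( !defined($status)
        || ( $status ne GM_OPEN() && $status ne GM_CLOSED() ) )
    {
        # The caller decides how errors are reported (user vs admin look).
        return $errHandler->('Unknown entry status');
    }

    # Return a string with no trailing newline; the caller prints it.
    return 'Status: ' . _escapeHtml($status);
}

package main;

my $ok = Gm_Example::formatStatus(
    status     => Gm_Example::GM_OPEN(),
    errHandler => sub { return 'ERROR: ' . shift(@_); },
);
my $bad = Gm_Example::formatStatus(
    status     => ' open',    # the kind of typo that constants help avoid
    errHandler => sub { return 'ERROR: ' . shift(@_); },
);
print $ok, "\n", $bad, "\n";
```

The error handler is passed in by the caller, so the same subroutine could be reused by a user page and an admin page with different error presentation.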
This is a road map of what I would like to do with the code and
where I am moving stuff to (i.e. the method to my madness).
The files
While having everything in gm-library.cgi has its advantages in that
it's easy to upgrade and install, the tradeoffs are too great. I was amazed that the
file has 12,000+ lines! Such a file is hard to navigate, and it is harder still to see what might
be duplicating what. By breaking this file into sections and refactoring the code
to use scoping and encapsulation, I think a lot of benefits will be achieved.
The libs directory will be the new repository of the code, with no configuration or
data persistence in the libs, just the library files (Perl modules).
The library Gm_Core will contain all of the essential functions of
GreyMatter such as applying the templates, formatting an entry, etc. Gm_Web will
contain methods that relate primarily to communicating with the user, such as
displaying screens and forms. Gm_Utils will contain the small miscellaneous functions
that will streamline other code, such as checking for hacks in a string and formatting
the date.
Gm_Storage is the location of all data persistence mechanisms, by
which I mean flat file reading or, if it were written, database access. The idea is
that the rest of GreyMatter doesn't care _how_ the data is stored, as long as it's
returned in a consistent manner. Gm_Upgrade focuses on upgrading from one version
to another, without cluttering up the other code, particularly Gm_Storage, since
upgrading does need to know how the data is stored. Gm_Constants is just a file
of constants that are used to avoid careless mistakes such as 'Yes' vs 'yes', or
testing for an empty string '' when the value is actually a string with a space ' '.
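As a rough sketch of the storage-neutral idea, the snippet below hides a flat-file style backend behind two neutral subroutines. It is an assumption-laden illustration, not the real Gm_Storage code: the backend hash, the in-memory array standing in for a file, and the 'backend' and 'message' parameter names are all made up here.

```perl
use strict;
use warnings;

# Hypothetical flat-file backend; an in-memory array stands in for the file.
my @flatFileLines;

my %flatFileBackend = (
    # The backend owns the newline, because here one line is one record.
    addLogMessage => sub {
        my $message = shift(@_);
        push( @flatFileLines, $message . "\n" );
        return 1;
    },

    # Callers get records back without storage artifacts such as newlines.
    getLogMessages => sub {
        my @messages = @flatFileLines;
        chomp(@messages);
        return \@messages;
    },
);

# The rest of the code only ever sees this neutral interface.
sub addLogMessage {
    my %args = @_;
    return $args{'backend'}{'addLogMessage'}->( $args{'message'} );
}

sub getLogMessages {
    my %args = @_;
    return $args{'backend'}{'getLogMessages'}->();
}

addLogMessage( backend => \%flatFileBackend, message => 'Entry 12 was added' );
my $log = getLogMessages( backend => \%flatFileBackend );
print $log->[0], "\n";
```

A database backend could supply the same two keys with different subroutines, and the callers would not need to change.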
Static vs Dynamic
As noted with Gm_Storage, I am trying to remove GreyMatter's dependency on flat files
so that if someone wants to use a database, it's at least possible if the Gm_Storage
library is rewritten. However, GreyMatter also generates static content through
rebuilds, comments, etc. For now, I think this is an advantage in that it's quicker
and lighter (server loads) than dynamically generating each page. The current
GreyMatter is even flexible enough so that PHP pages could be generated instead
of HTML, through the configs and templates.
Admin Issues
Until I learned more about CSS, I did everything with tables for appearance in pages,
such as doing borders and such, but with CSS we not only get more flexibility but
the HTML is also easier to read when working on the code. The current admin screens
are very table heavy and I would like to redo them to keep the same style but by using
div tags. At the same time I would also like to redo the navigation of the admin screen
so that there is less clicking around. I am planning a constant menu on every page that would
allow the user to at least jump around to the major sections. And lastly, a help feature
would be useful: simply a question icon next to words and fields that would link
to the gm_manual (popup perhaps?).
The template file is now order independent, meaning that there is no special
order to the templates stored within. This is because the format of the template is
now:
template_name=template_value
One template per line, each line beginning with the name of the template.
- Naming. Template names should always end with the word 'template', as this will make it
obvious that it's a template and easy to pull out of a hash (a form submission,
for example, will contain templates and other values).
- Todo: List template variables and what they are used for...
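A minimal sketch of reading this format, and of picking templates out of a mixed hash by their name suffix, might look like the following. The subroutine parseNameValueLines, the template names, and the form keys are hypothetical examples, not names taken from GreyMatter.

```perl
use strict;
use warnings;

# Parse "name=value" lines into a hash. Only the first '=' separates
# the name from the value, so values may themselves contain '='.
sub parseNameValueLines {
    my @lines = @_;
    my %values;
    foreach my $line (@lines) {
        chomp($line);
        next if ( $line eq '' );
        my ( $name, $value ) = split( /=/, $line, 2 );
        $values{$name} = defined($value) ? $value : '';
    }
    return \%values;
}

my $templates = parseNameValueLines(
    'indextemplate=<div class="index"></div>',
    'entrytemplate=<p></p>',
);

# A form submission mixes templates with other values; the naming
# convention makes the templates easy to select.
my %form = ( %{$templates}, 'authorname' => 'alice', 'action' => 'save' );
my @templateNames = sort( grep { /template$/ } keys(%form) );
print join( ',', @templateNames ), "\n";
```

Because each line carries its own name, the lines can be read in any order and still produce the same hash.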
The config file is now order independent, meaning that there is no special
order to the configuration variables stored within. This is because the format of the config is
now:
config_name=config_value
One config per line, each line beginning with the name of the config.
- Naming. Config names should always start with the word 'gm', as this will make it
obvious that it's a GreyMatter config and easy to pull out of a hash (a form submission,
for example, will contain configs and other values).
- Todo: List config variables and what they are used for...
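Going the other way, configs can be filtered out of a mixed hash by the 'gm' prefix and serialized back to lines. This is only a sketch: buildConfigLines and the config names gmexamplepath and gmexampletitle are invented for the example.

```perl
use strict;
use warnings;

# Keep only the 'gm'-prefixed keys of a mixed hash and serialize them as
# "name=value" lines. No trailing newline is added here; that is left to
# whatever code actually writes the lines to storage.
sub buildConfigLines {
    my %args = @_;
    my $form = $args{'form'};
    my @lines;
    foreach my $name ( sort( keys( %{$form} ) ) ) {
        next unless ( $name =~ /^gm/ );
        push( @lines, $name . '=' . $form->{$name} );
    }
    return \@lines;
}

my $lines = buildConfigLines(
    form => {
        'gmexamplepath'  => '/home/user/log',    # hypothetical config name
        'gmexampletitle' => 'My Weblog',         # hypothetical config name
        'submit'         => 'Save',              # not a config, skipped
    },
);
print join( "\n", @{$lines} ), "\n";
```

The sort is there only to make the output stable; the file format itself does not require any order.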
The counter file is now order independent, meaning that there is no special
order to the counter values stored within. This is because the format of the counter is
now:
counter_name=counter_value
One counter per line, each line beginning with the name of the counter variable.
- Naming. Counter variables don't follow as precise a pattern as configs and templates.
Rather the counter name attempts to be descriptive, while not providing redundant information.
- The counter variables and their meaning:
entrytotal = total number of entries posted
archivetotal = total number of entries not on front page
stayattopentry = entry# designated as "Stay At Top", marked 0 if there's no such thing
karmapos = total positive karma votes
karmaneg = total negative karma votes
commenttotal = total comments posted
opentotal = total number of open entries
closedtotal = total number of closed entries
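Since every line names its own counter, an update can be done by name rather than by line position. The following is a small sketch with made-up sample values; it does not use the real Gm_Storage functions.

```perl
use strict;
use warnings;

# Sample counter lines (values are made up for the example).
my @counterLines = ( 'entrytotal=12', 'commenttotal=40', 'karmapos=7' );

# Read: order does not matter because every line names its own counter.
my %counters = map { split( /=/, $_, 2 ) } @counterLines;

# Update by name, for example after a new comment has been posted.
$counters{'commenttotal'}++;

# Write back in any order; sorted here only to make the output stable.
my @updated = map { $_ . '=' . $counters{$_} } sort( keys(%counters) );
print join( "\n", @updated ), "\n";
```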
The entrylist file is now order independent, meaning that there is no special
order to the entrylist lines within. The order of the items isn't needed for any
of the current functionality provided by GreyMatter, and here is an example of sorting
by entry number:
my $gmentrylist = Gm_Storage::getEntrylist( errHandler=>\&Gm_Web::displayAdminErrorExit );
# Sort by entry id, highest id first
foreach my $entry ( sort { $gmentrylist->{$b}{'id'} <=> $gmentrylist->{$a}{'id'} } keys( %$gmentrylist ) ) {
    ...
}
- One entry per line, each line containing the following values separated by the '|' character:
- id = the numerical id of the entry (key of returned hash)(usually order in which entered, never 0)
- author = entry author's name (must be alphanumeric)
- subject = entry subject (must be alphanumeric)
- created = CREATE Date of entry in the format of mm/dd/yy (does include leading zeros)
- createt = CREATE Time of entry in the form of hh:mm [AM/PM] (does include leading zeros)
- status = entry status: open/closed, either O or C
- extended = is this an extended entry, either Y or N
- music = current music of entry (well author really)
- mood = current mood of entry (well author really)
- emoticons = are emoticons enabled, yes or no
- Naming. Entrylist fields don't follow as precise a pattern as configs and templates.
Rather the field name attempts to be descriptive, while not providing redundant information.
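As an illustration of the layout, one entrylist line can be mapped onto named fields with a hash slice. This sketch assumes the fields appear on disk in the order listed above; parseEntrylistLine and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Field order as listed above (assumed to match the on-disk order).
my @ENTRYLIST_FIELDS = qw(id author subject created createt
                          status extended music mood emoticons);

sub parseEntrylistLine {
    my $line = shift(@_);
    chomp($line);
    my %entry;

    # The -1 limit keeps empty fields, such as a missing music or mood.
    @entry{@ENTRYLIST_FIELDS} = split( /\|/, $line, -1 );
    return \%entry;
}

# Made-up sample lines, deliberately not in id order.
my %entrylist;
foreach my $line (
    '2|alice|Second post|01/05/06|09:15 AM|O|N|||yes',
    '1|bob|First post|01/02/06|08:00 PM|C|Y|||no'
  )
{
    my $entry = parseEntrylistLine($line);
    $entrylist{ $entry->{'id'} } = $entry;    # keyed by entry id
}

foreach my $id ( sort { $b <=> $a } keys(%entrylist) ) {
    print $id, ': ', $entrylist{$id}{'subject'}, "\n";
}
```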
The authors file is now order independent, meaning that there is no special
order to the author information within. The order of the items isn't needed for any
of the current functionality provided by GreyMatter, and here is an example of sorting
by author name alphabetically:
my $gmauthors = Gm_Storage::getAuthors( errHandler=>\&Gm_Web::displayAdminErrorExit );
# Sort by author name, ascending (case sensitive)
foreach my $author ( sort { $gmauthors->{$a}{'author'} cmp $gmauthors->{$b}{'author'} } keys( %$gmauthors ) ) {
    ...
}
- One author per line, each line containing the following values separated by the '|' character:
- author = author's name (key of returned hash (case sensitive))
- password = author's password (crypted)
- email = author's email
- homepage = author's homepage
- created = CREATE Date of the author
- posttotal = total number of postings by this author
- postnew = can this author make new posts Y or N
- editentries = can this author edit entries Y or N
- editconfigs = can this author edit configs Y or N
- edittemplates = can this author edit templates Y or N
- editauthors = can this author edit other authors Y or N
- rebuild = can this author rebuild files Y or N
- viewcplog = can this author view the control panel log Y or N
- bookmarklets = can this author use bookmarklets Y or N
- upload = can this author upload files Y or N
- viewadmin = can this author access the admin screen (gm.cgi) Y or N
- Naming. Authors don't follow as precise a pattern as configs and templates.
Rather the author variable name attempts to be descriptive, while not providing redundant information.
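The Y/N permission flags lend themselves to a small lookup helper. The sketch below assumes the fields are stored in the order listed above; authorCan, its parameter names, and the sample author line are made up for illustration.

```perl
use strict;
use warnings;

# Field order as listed above (assumed to match the on-disk order).
my @AUTHOR_FIELDS = qw(author password email homepage created posttotal
                       postnew editentries editconfigs edittemplates
                       editauthors rebuild viewcplog bookmarklets
                       upload viewadmin);

# Returns 1 if the named permission flag is 'Y' for this author, else 0.
sub authorCan {
    my %args   = @_;
    my $author = $args{'author'};
    my $flag   = $args{'permission'};
    return ( defined( $author->{$flag} ) && $author->{$flag} eq 'Y' ) ? 1 : 0;
}

# Made-up sample line with 6 descriptive fields followed by 10 flags.
my $line = 'alice|XXcrypted|alice@example.com|http://example.com|01/02/06|3'
  . '|Y|Y|N|N|N|Y|N|Y|N|Y';

my %author;
@author{@AUTHOR_FIELDS} = split( /\|/, $line, -1 );

print 'postnew: ',
  authorCan( author => \%author, permission => 'postnew' ), "\n";
print 'editconfigs: ',
  authorCan( author => \%author, permission => 'editconfigs' ), "\n";
```

An unknown permission name simply yields 0, which keeps the check on the safe side.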
The entry file remains unchanged. Note that the entry information is contained
within the cgi file and it is generated to the html files (by default, the file type
can be different than cgi). This information was culled from the old
greymatterforums site, originally contributed by Flipped Cracker (Robert).
The layout of the file is much more complex than the other files. The first 4 or 5
lines give the majority of the information about a particular entry, with comments
appearing after the 4th line:
- Line 1, information about the post/entry.
- Line 2, Karma-related information. The IP addresses and the votes associated
with those IP addresses are collated here.
- Line 3, the "main text" of each entry.
- Line 4, the "extended text" of each entry. If none, a blank line is left.
- Line 5 (and more if necessary), comments. One comment per line.
- The entry information is stored in the first line, with the following information separated
by the '|' character:
- id = numeric, never 0
- author = alphanumeric
- subject = title of post alphanumeric (we hope)
- weekday = numeric (0-6, 0=Sunday, 1=Monday, etc.)
- month = month of post numeric (1-12, no leading zeroes)
- day = day of post numeric (1-31, no leading zeroes)
- year = year of post numeric (format: yyyy)
- hour = hour of post numeric (1-12, no leading zeroes)
- minute = minute of post numeric (0-59, no leading zeroes)
- second = second of post numeric (0-59, no leading zeroes)
- ampm = either AM or PM
- karmapos = positive karma numeric
- karmaneg = negative karma numeric
- commenttotal = number of comments numeric; 0 if no comments
- karma = votes allowed yes/no
- comments = comments allowed yes/no
- status = entry open or closed open/closed
- music = current music of entry (well author really)
- mood = current mood of entry (well author really)
- emoticons = are emoticons enabled, yes or no
- The karma votes are stored in the second line, with the following information separated
by the '|' character (note that each entry has the default line '0.0.0.0|I'):
- ip = the ip that cast the karma vote
- vote = the karma vote, either a P for positive or N for negative
- The main text of the entry is stored in the third line, with single line breaks
replaced with '|*|' and double line breaks with '|*||*|'. All the text is presented
as one line.
- The extended text of the entry is stored in the fourth line, following the
same text conventions of line 3.
- The comments are stored in the fifth line and beyond, with the following information separated
by the '|' character:
- name = commenter's name alphanumeric
- ip = in the form of xxx.xxx.xxx.xxx
- email = in the form of user@email.com. If not provided, left blank.
- homepage = commenter's webpage in the form of http://www.site.com. If not provided, left blank.
- weekday = of comment numeric (0-6, 0=Sunday, 1=Monday, etc.)
- month = of comment numeric (1-12, no leading zeroes)
- day = of comment numeric (1-31, no leading zeroes)
- year = of comment numeric (format: yyyy)
- hour = of comment numeric (1-12, no leading zeroes)
- minute = of comment numeric (0-59, no leading zeroes)
- second = of comment numeric (0-59, no leading zeroes)
- ampm = either AM or PM
- comment = text presented all on one line, with the same text replacement
conventions as in the main entry text. (See Line 3.)
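One detail worth illustrating is that the comment text contains '|' characters (inside the '|*|' line-break markers) while '|' is also the field separator. A sketch of one way to handle this is to cap the split at the number of fields, so the text stays whole in the last field. The subroutine names and the sample comment line are hypothetical, and the field order is assumed to match the list above.

```perl
use strict;
use warnings;

# Field order as listed above (assumed to match the on-disk order).
my @COMMENT_FIELDS = qw(name ip email homepage weekday month day year
                        hour minute second ampm comment);

# Turn stored text back into multi-line text: each '|*|' is one line break.
sub decodeText {
    my $text = shift(@_);
    $text =~ s/\|\*\|/\n/g;
    return $text;
}

# Inverse of decodeText, for storing text on a single line.
sub encodeText {
    my $text = shift(@_);
    $text =~ s/\r?\n/|*|/g;
    return $text;
}

sub parseCommentLine {
    my $line = shift(@_);
    chomp($line);
    my %comment;

    # Limit the split to the number of fields so that the '|*|' markers
    # inside the comment text (the last field) are left intact.
    @comment{@COMMENT_FIELDS} =
      split( /\|/, $line, scalar(@COMMENT_FIELDS) );
    $comment{'comment'} = decodeText( $comment{'comment'} );
    return \%comment;
}

# Made-up sample comment with a blank email and a double line break.
my $stored = 'bob|192.0.2.10||http://www.site.com|1|3|14|2005|9|5|7|PM'
  . '|Nice post.|*||*|More soon.';
my $comment = parseCommentLine($stored);
print $comment->{'comment'}, "\n";
```

The same decodeText helper would apply to the main text and extended text on lines 3 and 4, since they use the same replacement convention.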
The banlist is simply a list of Internet Protocol addresses that are prevented from
using the functionality of the GreyMatter software, from posting comments to
accessing the admin page. The banlist file is now order independent, meaning that there is no special
order to the banlist information within. The order of the items isn't needed for any
of the current functionality provided by GreyMatter.
- One banned ip per line, each line containing the following values separated by the '|' character:
- ip = ip address of machine to ban (key of returned hash)
- host = the hostname of the banned ip (currently not used)
- label = an optional label to describe the banned ip
- Naming. Banlist fields don't follow as precise a pattern as configs and templates.
Rather the field name attempts to be descriptive, while not providing redundant information.
- Spam. We are at the mercy of the webserver to tell us the i.p. of the request (through the REMOTE_ADDR
environment variable). However, the webserver can be 'fooled' and given a bad i.p. (google: ip spoofing). Also,
those users that use dialup or non-premium dsl/cable usually do not have a static i.p. (companies love to
charge for this convenience). This means that banning by ip is usually ineffective to prevent spammers, but
can usually be useful against nuisance users (most work places will be using static i.p.s), and if you notice a
pattern you could ban a range of i.p.s.
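A ban check over this file could be sketched as below. The exact-match part follows directly from the layout above; the range part is purely an assumption of this sketch (a banned value ending in '.' is treated as an address prefix) and is not necessarily how GreyMatter bans ranges. The names parseBanlist and isBanned are hypothetical.

```perl
use strict;
use warnings;

# Parse banlist lines (ip|host|label) into a hash keyed by ip.
sub parseBanlist {
    my @lines = @_;
    my %banlist;
    foreach my $line (@lines) {
        chomp($line);
        my ( $ip, $host, $label ) = split( /\|/, $line, 3 );
        $banlist{$ip} = { ip => $ip, host => $host, label => $label };
    }
    return \%banlist;
}

# Exact match, or prefix match when the banned value ends with a '.'
# (the prefix form is an assumption of this sketch, used to cover a range).
sub isBanned {
    my %args    = @_;
    my $banlist = $args{'banlist'};
    my $ip      = $args{'ip'};
    foreach my $banned ( keys( %{$banlist} ) ) {
        return 1 if ( $ip eq $banned );
        return 1 if ( $banned =~ /\.$/ && index( $ip, $banned ) == 0 );
    }
    return 0;
}

# Made-up sample banlist using documentation-only address ranges.
my $banlist = parseBanlist(
    '192.0.2.44||comment spammer',
    '198.51.100.||nuisance range',
);

foreach my $ip ( '192.0.2.44', '198.51.100.7', '203.0.113.5' ) {
    print $ip, ' banned: ', isBanned( banlist => $banlist, ip => $ip ), "\n";
}
```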
The cplog file is a listing of log entries entered by GreyMatter to keep the user
informed of certain events. This file is order dependent, with the first line being the oldest and the
last being the most recent. Note that Gm_Storage functions treat the cplog information as an
array, mostly to preserve order (also because there isn't really a logical key besides
an arbitrary id number or the date+time the line was logged, and this would probably be a pain to
sort).
- One log 'entry' per line, with the date and time of the log entry usually added, but not
always. Resetting the log is simply a matter of clearing out the file.