SectionTopics

From SoylentNews
Jump to: navigation, search

CssWork parent

NAME
    sectiontopics - Information about the section-topics rewrite, June 2004

DESCRIPTION
    In June 2004, the way Slash handles categorization of stories into
    topics was changed. The confusing relationship between topics, sections,
    and subsections, which limited choices, was bolted on to the old system
    as our needs evolved.

    The "section-topics" rewrite, as it's being called, took a look at those
    needs with the perspective of hindsight, and developed a new system
    which will be less confusing in the long run. It provides the
    flexibility that large sites' admins will need, while still retaining
    ease of use for smaller sites' admins.

    This document attempts to explain the changes for Slash administrators,
    and it may be informative reading for Slash authors and editors too.

CONCEPTS
    I'm not going to bother explaining exactly how the old system worked,
    because it's dead now. Suffice it to say that every story had one
    section and one or more topics, those being two distinct and disjoint
    categorizations. Subsections were invented to allow finer-grained
    control than what sections permitted, and there were ways to constrain
    certain topics to only being available in certain sections.

    The "section" data type was always confused about whether it wanted to
    be a categorization and descriptor, or a visual display modifier. In
    other words, regarding an object that it was assigned to like a story,
    "section" implied facts about both its data and how it should be viewed.
    Being in the "Book Reviews" section on Slashdot, for example, implied
    that additional data like ISBN would be stored along with the rest of
    the story's data. Being in the "Developers" section meant that it would
    be grouped with other developer-related stories and that it would appear
    blue instead of green. If a story was a book review for developers,
    there was no way to put it in both places; having that blue color meant
    that no ISBN data could be stored.

    So "section" has been split into two: "skin" and "nexus". *Most* of the
    information that went with a section was used to describe appearances,
    and that went over to skin. So a skin now controls color (through the
    skin_colors table), it controls which templates are used (the final part
    of a template's three-part name is now skin, not section), and it
    controls with which other stories a story is grouped (on which index
    page). And the non-display aspects of sections -- mainly, the
    "section_extras" data which ensured that stories in Book Reviews stored
    a field for ISBN -- were sent over to nexuses.

    Each skin has precisely one nexus; you can think of a skin as drawing
    its stories from its nexus. The clever part is that a nexus is just a
    special kind of topic (which we call a topic_nexus when we want to
    emphasize that it is both). So if a story has both the Developers
    topic_nexus, and the Book Reviews topic_nexus, then it will appear on
    both books.slashdot.org and developers.slashdot.org. And the additional
    data stored with the story will include the union of all the "extras"
    data -- not only ISBN and so on, but also any "extras" data that may be
    in the Developers nexus. There don't actually happen to be any extras
    for Developers on Slashdot, so maybe this isn't the best example, but if
    there were, a story that was categorized into both nexuses would include
    that data too.

    Authors are no longer restricted from choosing any topic with any story.
    Since a nexus is a topic like any other, an author who wishes to make
    sure a story shows up in the Apple skin can pick the Apple nexus
    specifically. But what's more likely to happen is that an author will
    pick topics that make sense for the story (like "Mac OS X") and that the
    weights assigned to those topics will propagate upwards into the correct
    nexus(es) where the story should appear.

    Along the way, stories were given a numeric primary key (stoid), as were
    skins. Don't worry, a story's sid still works just as it did before; no
    Slash URLs are required to change, and in particular all the URLs (for
    search.pl and index.pl) that had "section=" as a parameter still do.

WEIGHTS
    Each topic picked for a story now must have a weight assigned to it.
    Weights are floating-point non-negative numbers. The templates shipped
    with the stock theme assume that authors will be assigning only the
    weights 0, 10, 20, 30, 40, or 50, but the only weight value that the
    core code treats specially is 0 (which means "ignore this topic"). How
    positive weights affect the categorization of stories depends entirely
    on the topic_parents table.

    Even in the old system, topics could be arranged into a tree -- but to
    properly represent section-specific topics one would have to layer
    additional data on top of the tree, confusing matters somewhat. Now,
    topics (including nexuses) really do form a tree, which is to say a
    directed graph. All topic-related data is loaded into a $slashd object
    at once, when getTopicTree is called.

    One key difference between the old system and the new is that,
    previously, a topic had at most one parent. Now, a topic may have zero
    or more parents. This allows a topic of "Darwin" to be a child of both
    "Apple" and "BSD," which conceptually means that it is a subcategory of
    both, and which practically means that assigning "Darwin" a weight will
    allow that weight to propagate up to both.

RENDERING
    This process of weight-propagation occurs when chosen topics are
    rendered. Each parent-child relationship from one topic to another
    includes a minimum weight. For any given story, if topic T1 is assigned
    weight W, and topic T2 is the parent of T1 with min_weight M, then T2
    will also be assigned weight W for the story if, and only if, M <= W.

    That assignment continues recursively (to topic T3, and so on) in a
    process called "rendering" -- performed by renderTopics(). A story
    author stores his or her topic/weight duples in the story_topics_chosen
    table, and at story save time, these choices are rendered into a
    (probably larger) collection of topic/weight duples that are stored in
    the story_topics_rendered table.

    (The above rule describes most of what is involved in the rendering
    process. The other rules in the algorithm are that if multiple children
    of differing weights both propagate up to the same parent, the greater
    of those weights become the parent's; and that any topic's chosen
    weight, including a weight of 0, always overrides any weight propagating
    up from its children.)

    Finally, when the collection of rendered topic/weight duples has been
    fully formed, all topics with weight 0 are dropped. Weight 0 can exist
    in chosen topics, but never in rendered topics.

A SIMPLE EXAMPLE
    That may sound a bit complicated, so here's a description using the
    default topics and topic_parents included with the default "slashcode"
    theme:

    tid 1: mainpage (also a nexus)
    tid 3: opensource (also a nexus)
    tid 4: slash
    tid 7: linux

    Tid 4 has tid 3 as a parent, with that relationship having a min_weight
    of 10 associated with it.

    Tid 7 also have tid 3 as a parent, with min_weight 10.

    Tid 3 has tid 1 as a parent, with min_weight 30.

    The skin at "http://example.com/" reads the nexus tid 1; the skin at
    "http://opensource.example.com/" reads the nexus tid 3.

    Suppose an editor is working on a story about Slash and assigns it the
    topic "slash," tid 4, with weight 10. Weight 10 is described as
    "Sectional only" in the code. When that story is saved, renderTopics
    recursively propagates the weight of its topics, or in this case its
    single topic, up to parents, or in this case parent. Rendering adds tid
    3, also at weight 10. It does not add tid 1 since !(10 >= 30). The story
    thus will appear only on "http://opensource.example.com/" and not
    "http://example.com/".

    Now suppose the editor re-edits the story, adding important information
    about Linux. He or she at that point adds the Linux tid 7 with a weight
    of 30. Now when it is saved, tid 1 is added since 30 >= 30. Now the
    story will appear at both URLs.

    Now suppose the editor is instructed that this story must be removed
    from "http://opensource.example.com/", though it should stay on the main
    page "http://example.com/". In the admin.pl editor, he or she adds tid 3
    with weight 0. That by itself would remove both nexuses when the story
    saves, since the 0 would prohibit both child tids from propagating
    higher than tid 3 up to tid 1. Upon clicking Preview, the admin sees
    that "This story will not appear" (see admin.pl
    getDescForTopicsRendered()). So the admin also adds tid 1 -- any weight
    greater than 0 would do, but weight of 30 makes the most sense since the
    backend describes that as "Mainpageworthy." Once this story is saved,
    its rows in story_topics_chosen are:

            tid 1, weight 30
            tid 3, weight  0
            tid 4, weight 10
            tid 7, weight 10

    and its rows in story_topics_rendered are:

            tid 1, weight 30
            tid 4, weight 10
            tid 7, weight 10

    Note that, as far as almost all of the code is concerned, the weight
    value in story_topics_rendered is irrelevant; only whether a row exists
    or not is noted. (This is why weight of 0 never appears in that table.)

DISPLAY OPTIONS AND WEIGHTS
    So how are these values of weights 10 and 30 decided, and what are 20
    and 50 for?

    Previously in Slash, there were three possible values for a story's
    displaystatus: Never Display, Section-Only, and Always Display.
    Section-Only meant to only display a story in its section's homepage,
    not the site's main page, and Always Display meant to display a story
    both places.

    Now that a story may be part of more than one skin (the new term for
    "section"), that distinction is not so simple. While the method
    _displaystatus() will return an old-style displaystatus value for a
    story, this is for reverse compatibility and is deprecated. The proper
    question now takes two arguments instead of one: is a story to be
    displayed _in_ a particular skin.

    The answer to that question is very simple; if a row exists in
    story_topics_rendered with the story's stoid and the topic's tid, then
    yes; otherwise, no.

    To prevent everything from breaking at once, and to keep the backend
    story list looking much the same as it did before (white background for
    Always Display, light gray for Section-Only, dark gray for Never
    Display), the "mainpage skin" was created. Defined by the var
    mainpage_skid [sic, a skid is a skin's numeric primary key], this
    defines which skin a story must be in to be considered "Always Display."
    It also defines which topic nexus will be colored blue instead of yellow
    in admin.pl?op=topictree (you will need GraphViz installed to see this;
    see plugins/Admin/README). Nevertheless, Slash is now well-equipped to
    run a website which consists of many subsites, at different URLs,
    perhaps only loosely networked and not necessarily with one central
    "main" page.

    multiple skins and how index.pl uses stories.primaryskid

    skins.cookiedomain and the cookiedomain var

    the Topiclist

    the topic chooser

    no admin.pl interface to edit topic tree yet, but op=topictree (and
    GraphViz, see plugins/Admin/README)

    suggestions for a clean topic tree (use min_weight 10 to connect
    many/most topics to logical categorizations, which could/should be
    nexuses, then connect those to mainpage with min_weight 30, finally
    bring loose topics to mainpage with min_weight 30)

    utils/convertDBto200406

    and _suggest and how it can be used to advise on a better topic tree

    and _render which needs to be run

VERSION
    $Id$

[root@slashcode docs]#  
[root@slashcode docs]# ls
boilerplates          HOWTO-Themes.html   sectiontopics.txt  slashstyle.txt
dopods.plx            HOWTO-Themes.pod    slasherd.fig       slashtables.html
formkeys.txt          HOWTO-Themes.txt    slasherd.pdf       slashtables.pod
HOWTO-Plugins.html    INSTALL.pod         slasherd.ps        slashtables.txt
HOWTO-Plugins.pod     INSTALL.txt         slashguide.html    slashtags.html
HOWTO-Plugins.txt     README.pod          slashguide.pod     slashtags.pod
HOWTO-Templates.html  README.txt          slashguide.txt     slashtags.txt
HOWTO-Templates.pod   sectiontopics.html  slashstyle.html
HOWTO-Templates.txt   sectiontopics.pod   slashstyle.pod
[root@slashcode docs]# more sectiontopics.txt 
NAME
    sectiontopics - Information about the section-topics rewrite, June 2004

DESCRIPTION
    In June 2004, the way Slash handles categorization of stories into
    topics was changed. The confusing relationship between topics, sections,
    and subsections, which limited choices, was bolted on to the old system
    as our needs evolved.

    The "section-topics" rewrite, as it's being called, took a look at those
    needs with the perspective of hindsight, and developed a new system
    which will be less confusing in the long run. It provides the
    flexibility that large sites' admins will need, while still retaining
    ease of use for smaller sites' admins.

    This document attempts to explain the changes for Slash administrators,
    and it may be informative reading for Slash authors and editors too.

CONCEPTS
    I'm not going to bother explaining exactly how the old system worked,
    because it's dead now. Suffice it to say that every story had one
    section and one or more topics, those being two distinct and disjoint
    categorizations. Subsections were invented to allow finer-grained
    control than what sections permitted, and there were ways to constrain
    certain topics to only being available in certain sections.

    The "section" data type was always confused about whether it wanted to
    be a categorization and descriptor, or a visual display modifier. In
    other words, regarding an object that it was assigned to like a story,
    "section" implied facts about both its data and how it should be viewed.
    Being in the "Book Reviews" section on Slashdot, for example, implied
    that additional data like ISBN would be stored along with the rest of
    the story's data. Being in the "Developers" section meant that it would
    be grouped with other developer-related stories and that it would appear
    blue instead of green. If a story was a book review for developers,
    there was no way to put it in both places; having that blue color meant
    that no ISBN data could be stored.

    So "section" has been split into two: "skin" and "nexus". *Most* of the
    information that went with a section was used to describe appearances,
    and that went over to skin. So a skin now controls color (through the
    skin_colors table), it controls which templates are used (the final part
    of a template's three-part name is now skin, not section), and it
    controls with which other stories a story is grouped (on which index
    page). And the non-display aspects of sections -- mainly, the
    "section_extras" data which ensured that stories in Book Reviews stored
    a field for ISBN -- were sent over to nexuses.

    Each skin has precisely one nexus; you can think of a skin as drawing
    its stories from its nexus. The clever part is that a nexus is just a
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# 
[root@slashcode docs]# cat sectiontopics.txt 
NAME
    sectiontopics - Information about the section-topics rewrite, June 2004

DESCRIPTION
    In June 2004, the way Slash handles categorization of stories into
    topics was changed. The confusing relationship between topics, sections,
    and subsections, which limited choices, was bolted on to the old system
    as our needs evolved.

    The "section-topics" rewrite, as it's being called, took a look at those
    needs with the perspective of hindsight, and developed a new system
    which will be less confusing in the long run. It provides the
    flexibility that large sites' admins will need, while still retaining
    ease of use for smaller sites' admins.

    This document attempts to explain the changes for Slash administrators,
    and it may be informative reading for Slash authors and editors too.

CONCEPTS
    I'm not going to bother explaining exactly how the old system worked,
    because it's dead now. Suffice it to say that every story had one
    section and one or more topics, those being two distinct and disjoint
    categorizations. Subsections were invented to allow finer-grained
    control than what sections permitted, and there were ways to constrain
    certain topics to only being available in certain sections.

    The "section" data type was always confused about whether it wanted to
    be a categorization and descriptor, or a visual display modifier. In
    other words, regarding an object that it was assigned to like a story,
    "section" implied facts about both its data and how it should be viewed.
    Being in the "Book Reviews" section on Slashdot, for example, implied
    that additional data like ISBN would be stored along with the rest of
    the story's data. Being in the "Developers" section meant that it would
    be grouped with other developer-related stories and that it would appear
    blue instead of green. If a story was a book review for developers,
    there was no way to put it in both places; having that blue color meant
    that no ISBN data could be stored.

    So "section" has been split into two: "skin" and "nexus". *Most* of the
    information that went with a section was used to describe appearances,
    and that went over to skin. So a skin now controls color (through the
    skin_colors table), it controls which templates are used (the final part
    of a template's three-part name is now skin, not section), and it
    controls with which other stories a story is grouped (on which index
    page). And the non-display aspects of sections -- mainly, the
    "section_extras" data which ensured that stories in Book Reviews stored
    a field for ISBN -- were sent over to nexuses.

    Each skin has precisely one nexus; you can think of a skin as drawing
    its stories from its nexus. The clever part is that a nexus is just a
    special kind of topic (which we call a topic_nexus when we want to
    emphasize that it is both). So if a story has both the Developers
    topic_nexus, and the Book Reviews topic_nexus, then it will appear on
    both books.slashdot.org and developers.slashdot.org. And the additional
    data stored with the story will include the union of all the "extras"
    data -- not only ISBN and so on, but also any "extras" data that may be
    in the Developers nexus. There don't actually happen to be any extras
    for Developers on Slashdot, so maybe this isn't the best example, but if
    there were, a story that was categorized into both nexuses would include
    that data too.

    Authors are no longer restricted from choosing any topic with any story.
    Since a nexus is a topic like any other, an author who wishes to make
    sure a story shows up in the Apple skin can pick the Apple nexus
    specifically. But what's more likely to happen is that an author will
    pick topics that make sense for the story (like "Mac OS X") and that the
    weights assigned to those topics will propagate upwards into the correct
    nexus(es) where the story should appear.

    Along the way, stories were given a numeric primary key (stoid), as were
    skins. Don't worry, a story's sid still works just as it did before; no
    Slash URLs are required to change, and in particular all the URLs (for
    search.pl and index.pl) that had "section=" as a parameter still do.

WEIGHTS
    Each topic picked for a story now must have a weight assigned to it.
    Weights are floating-point non-negative numbers. The templates shipped
    with the stock theme assume that authors will be assigning only the
    weights 0, 10, 20, 30, 40, or 50, but the only weight value that the
    core code treats specially is 0 (which means "ignore this topic"). How
    positive weights affect the categorization of stories depends entirely
    on the topic_parents table.

    Even in the old system, topics could be arranged into a tree -- but to
    properly represent section-specific topics one would have to layer
    additional data on top of the tree, confusing matters somewhat. Now,
    topics (including nexuses) really do form a tree, which is to say a
    directed graph. All topic-related data is loaded into a $slashd object
    at once, when getTopicTree is called.

    One key difference between the old system and the new is that,
    previously, a topic had at most one parent. Now, a topic may have zero
    or more parents. This allows a topic of "Darwin" to be a child of both
    "Apple" and "BSD," which conceptually means that it is a subcategory of
    both, and which practically means that assigning "Darwin" a weight will
    allow that weight to propagate up to both.

RENDERING
    This process of weight-propagation occurs when chosen topics are
    rendered. Each parent-child relationship from one topic to another
    includes a minimum weight. For any given story, if topic T1 is assigned
    weight W, and topic T2 is the parent of T1 with min_weight M, then T2
    will also be assigned weight W for the story if, and only if, M <= W.

    That assignment continues recursively (to topic T3, and so on) in a
    process called "rendering" -- performed by renderTopics(). A story
    author stores his or her topic/weight duples in the story_topics_chosen
    table, and at story save time, these choices are rendered into a
    (probably larger) collection of topic/weight duples that are stored in
    the story_topics_rendered table.

    (The above rule describes most of what is involved in the rendering
    process. The other rules in the algorithm are that if multiple children
    of differing weights both propagate up to the same parent, the greater
    of those weights become the parent's; and that any topic's chosen
    weight, including a weight of 0, always overrides any weight propagating
    up from its children.)

    Finally, when the collection of rendered topic/weight duples has been
    fully formed, all topics with weight 0 are dropped. Weight 0 can exist
    in chosen topics, but never in rendered topics.

A SIMPLE EXAMPLE
    That may sound a bit complicated, so here's a description using the
    default topics and topic_parents included with the default "slashcode"
    theme:

    tid 1: mainpage (also a nexus)
    tid 3: opensource (also a nexus)
    tid 4: slash
    tid 7: linux

    Tid 4 has tid 3 as a parent, with that relationship having a min_weight
    of 10 associated with it.

    Tid 7 also have tid 3 as a parent, with min_weight 10.

    Tid 3 has tid 1 as a parent, with min_weight 30.

    The skin at "http://example.com/" reads the nexus tid 1; the skin at
    "http://opensource.example.com/" reads the nexus tid 3.

    Suppose an editor is working on a story about Slash and assigns it the
    topic "slash," tid 4, with weight 10. Weight 10 is described as
    "Sectional only" in the code. When that story is saved, renderTopics
    recursively propagates the weight of its topics, or in this case its
    single topic, up to parents, or in this case parent. Rendering adds tid
    3, also at weight 10. It does not add tid 1 since !(10 >= 30). The story
    thus will appear only on "http://opensource.example.com/" and not
    "http://example.com/".

    Now suppose the editor re-edits the story, adding important information
    about Linux. He or she at that point adds the Linux tid 7 with a weight
    of 30. Now when it is saved, tid 1 is added since 30 >= 30. Now the
    story will appear at both URLs.

    Now suppose the editor is instructed that this story must be removed
    from "http://opensource.example.com/", though it should stay on the main
    page "http://example.com/". In the admin.pl editor, he or she adds tid 3
    with weight 0. That by itself would remove both nexuses when the story
    saves, since the 0 would prohibit both child tids from propagating
    higher than tid 3 up to tid 1. Upon clicking Preview, the admin sees
    that "This story will not appear" (see admin.pl
    getDescForTopicsRendered()). So the admin also adds tid 1 -- any weight
    greater than 0 would do, but weight of 30 makes the most sense since the
    backend describes that as "Mainpageworthy." Once this story is saved,
    its rows in story_topics_chosen are:

            tid 1, weight 30
            tid 3, weight  0
            tid 4, weight 10
            tid 7, weight 10

    and its rows in story_topics_rendered are:

            tid 1, weight 30
            tid 4, weight 10
            tid 7, weight 10

    Note that, as far as almost all of the code is concerned, the weight
    value in story_topics_rendered is irrelevant; only whether a row exists
    or not is noted. (This is why weight of 0 never appears in that table.)

DISPLAY OPTIONS AND WEIGHTS
    So how are these values of weights 10 and 30 decided, and what are 20
    and 50 for?

    Previously in Slash, there were three possible values for a story's
    displaystatus: Never Display, Section-Only, and Always Display.
    Section-Only meant to only display a story in its section's homepage,
    not the site's main page, and Always Display meant to display a story
    both places.

    Now that a story may be part of more than one skin (the new term for
    "section"), that distinction is not so simple. While the method
    _displaystatus() will return an old-style displaystatus value for a
    story, this is for reverse compatibility and is deprecated. The proper
    question now takes two arguments instead of one: is a story to be
    displayed _in_ a particular skin.

    The answer to that question is very simple; if a row exists in
    story_topics_rendered with the story's stoid and the topic's tid, then
    yes; otherwise, no.

    To prevent everything from breaking at once, and to keep the backend
    story list looking much the same as it did before (white background for
    Always Display, light gray for Section-Only, dark gray for Never
    Display), the "mainpage skin" was created. Defined by the var
    mainpage_skid [sic, a skid is a skin's numeric primary key], this
    defines which skin a story must be in to be considered "Always Display."
    It also defines which topic nexus will be colored blue instead of yellow
    in admin.pl?op=topictree (you will need GraphViz installed to see this;
    see plugins/Admin/README). Nevertheless, Slash is now well-equipped to
    run a website which consists of many subsites, at different URLs,
    perhaps only loosely networked and not necessarily with one central
    "main" page.

    multiple skins and how index.pl uses stories.primaryskid

    skins.cookiedomain and the cookiedomain var

    the Topiclist

    the topic chooser

    no admin.pl interface to edit topic tree yet, but op=topictree (and
    GraphViz, see plugins/Admin/README)

    suggestions for a clean topic tree (use min_weight 10 to connect
    many/most topics to logical categorizations, which could/should be
    nexuses, then connect those to mainpage with min_weight 30, finally
    bring loose topics to mainpage with min_weight 30)

    utils/convertDBto200406

    and _suggest and how it can be used to advise on a better topic tree

    and _render which needs to be run