"Where do I even start??"
A writeup of my bughunting process for new codebases, specifically DW. Posted largely for the dw-dev-training community; the rest of you can probably afford to skip it, unless you have ideas to add.
I've seen a couple of posts to dw-dev[-training] from people who want to help, but have no experience with a codebase on the scale of DW, let alone the extra fun of a web-based app. So this is a quick walkthrough of a recent bug. It's only my third bug for DW, and in a new area, so I went through all the "Now what?" steps myself. Hopefully so you don't have to :)
The bug was http://bugs.dwscoalition.org/show_bug.cgi?id=2843 - involving display of tags in inbox notifications. So the first thing to do is set up an environment where you can locate the "problem" (in scare quotes because this isn't necessarily a bug, per se; it's a feature request. But the procedure is largely the same). Set up everything you need to check out the problem and your work. In this case: create a community, start watching it, make a couple of posts with tags, and see what turns up in your inbox. (A side note: this is where it's handy to have 2 user accounts, because I don't think dw will send you notification of your own posts. Log in as user A, create a community, start tracking it. Log out, log back in as user B, join the community, post to it with a tagged entry. Log out, log back in as user A, check your inbox. Go to your dreamhack and kick the mail queue. Go check your inbox again.) So. I can see, now, that the page I'm visiting is http://www.tyggerjai.hack.dreamwidth.net/inbox/ .
Code in a dw install is mostly in 2 places - dw/cgi-bin and dw/htdocs. URLs on your dw install map pretty closely to the htdocs directory - the htdocs .bml files are the "scripts", if you like, that dw calls to generate your content. If you add .bml to the end of the URL's path, and replace the server name with htdocs/, you should come up with the file you need to start with. So I'm looking for htdocs/inbox.bml:
dh-tyggerjai@hack:~/dw$ ls htdocs/inbox.bml
ls: cannot access htdocs/inbox.bml: No such file or directory
Ooookay. Or maybe I'm not. Oh, hang on. This isn't like http://www.tyggerjai.hack.dreamwidth.net/update , which would be htdocs/update.bml. Inbox ends in a slash, which means it's a directory.
dh-tyggerjai@hack:~/dw$ ls htdocs/inbox/
compose.bml compose.bml.text index.bml index.bml.text markspam.bml
Gotcha. In the absence of further information, it's fair to assume that a URL that points at a directory will actually serve the index file from that directory. So let's start with index.bml.
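The URL-to-file guess above can be written down as a tiny shell helper. This is only a sketch of the rule of thumb described in this post; `bml_candidates` is a made-up name, not part of the DW tooling, and the mapping is an assumption that holds for the two examples here rather than a guarantee.

```shell
# Hypothetical helper: list the htdocs files a URL path might map to.
# Assumes the layout described above: either <path>.bml or <path>/index.bml.
bml_candidates() {
    path="${1#/}"      # strip a leading slash
    path="${path%/}"   # strip a trailing slash
    if [ -z "$path" ]; then
        # Bare site root: the only sensible guess is the top-level index.
        printf '%s\n' "htdocs/index.bml"
    else
        printf '%s\n' "htdocs/$path.bml" "htdocs/$path/index.bml"
    fi
}

# Example, run from ~/dw:
#   for f in $(bml_candidates /inbox/); do ls "$f"; done
```

Running both candidates through ls shows straight away which one exists, which saves the "Ooookay. Or maybe I'm not." step.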
This is where it helps if you have a dreamhack, so you can play along at home. Because I'm not going to post the whole of these files here (although they are open source ....). Anyway. dw/htdocs/inbox/index.bml has about 280 lines. I start at the beginning and wander down until I see something meaningful. This can take a while ....
To be honest, I think I skimmed this file all the way to the end on my first time through without anything really jumping out at me. So I went back to anything that had seemed even vaguely relevant the first time, and came up with:
$body .= LJ::Widget::InboxFolder->render(
How do I know that's important? Well, partly because I've done this before, and I'm the kind of guy who thinks it's clever to make the whoooole inbox an object that can render itself. Yeah baby. That's how we rOOll ! But mostly because nothing else actually *does* much. Uh, I mean, there's a lot of useful scaffolding and ancillary code *supporting* the actual inbox. But it looks like this:
"# Inbox Nav".
"# Allow bookmarking to work without Javascript
# or before JS events are bound"
"# go through each item and see if it's checked"
Ok. Those are all hints, but they're really not core to the display of messages in an inbox. But wait: this is more interesting:
"# get events sitting in inbox
my @notifications = $inbox->items;"
Rightio. So there's a $inbox, and it has items. It's not the actual code we want, because of *course* we separate data storage from presentation of navigation elements. But it's a clue. Everything else is fluff. When you show an inbox folder, you have a lot of fluff around it that will be the same regardless of the contents of the folder - navigation, your journal title, whatever it is. But that line right there goes to the actual inbox, and gets the actual notifications from it.
And then, to be quite honest, does precious little with it. The next 20 lines are fluff again. But you know there's an inbox, and you know there are items. So fire up your text editor's search function, and look for $inbox. Go past the error messages, the "check read". Soon you get to @all_items. That's more interesting, because that's meat. That's all the items we're going to display here. Don't worry about why it's different from $inbox->items. We don't care. Sure enough, straight after that is "LJ::Widget::InboxFolder->render" , which takes $inbox and @all_items in its arguments. That's pay dirt. We got there just by thinking about what the page does (prints out some messages surrounded by a journal page), and which bit we need to change (the bit that prints the messages, because we're adding to it).
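If you'd rather do that search from the command line than in an editor, something like the following works. It's a minimal sketch; `find_literal` is a name I've made up for illustration.

```shell
# Print every line containing a literal string, with line numbers.
# -F makes grep treat the pattern as a fixed string, so "$" is not read
# as an end-of-line anchor; "--" protects patterns that start with "-".
find_literal() {
    grep -nF -- "$1" "$2"
}

# Example, run from ~/dw. The single quotes stop the shell from trying
# to expand $inbox as one of its own variables:
#   find_literal '$inbox' htdocs/inbox/index.bml
```

The line numbers make it easy to jump between the hits and skip the ones that are just error handling.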
Ok. So this is our next clue. LJ::Widget is going to be a Perl module. LJ uses a whole lot of modules from the LJ namespace, which then have other modules under them. We're looking (probably) for InboxFolder.pm. Back to the command line:
dh-tyggerjai@hack:~/dw$ find . -name "InboxFolder.pm"
./cvs/dw-free/cgi-bin/LJ/Widget/InboxFolder.pm
./cgi-bin/LJ/Widget/InboxFolder.pm
Bingo. (We're working, by the way, in the cgi-bin tree, not the cvs files. Don't touch the cvs-files :)
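The find works, but you can also guess the path directly, because Perl package names conventionally map onto directories: each :: becomes a /, with .pm on the end. A small sketch, assuming cgi-bin is the library root as the find output suggests (`module_path` is a hypothetical helper, not something that ships with dw):

```shell
# Hypothetical helper: turn a Perl package name into its conventional
# file path under a library root (default: cgi-bin, an assumption
# based on the find output above).
module_path() {
    printf '%s/%s.pm\n' "${2:-cgi-bin}" "$(printf '%s' "$1" | sed 's|::|/|g')"
}

# Example, run from ~/dw:
#   ls "$(module_path LJ::Widget::InboxFolder)"
```

If the ls fails, fall back to find; not every package lives where its name says it should.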
So, back to the editor. We know the function we need is called "render", and in Perl, that will be declared as a sub, so we're specifically looking for "sub render". Of course, that first finds us "sub render_body". We can try "sub render ", or "sub render{", or "sub render(" or "sub render\n" , depending on house style, but it's probably easier to keep looking for "sub render" unless you know the house style.
At least, it would be easier if such a sub existed. In fact, the string "render" occurs twice in this file, and neither is what we want. From here, we have a couple of options.
1) I am not an Object, I am a human ... oh, wait.
LJ::Widget::InboxFolder is a module that inherits from a parent, LJ::Widget. So if InboxFolder doesn't have "sub render", maybe Widget does. Which in fact, it does. Widget->render sets up a lot of fluff and then calls $widget->render_body. So InboxFolder->render_body is the right place after all.
2) Remember that browser? Go back to it, and take a look at the source for the messages we set up. The inbox entries are wrapped inside a div tag - "div class="InboxItem_Content""
That's fairly distinctive, and probably only used in an inbox, for an inbox item. So, back to the command line:
dh-tyggerjai@hack:~/dw$ rgrep "InboxItem_Content" cgi-bin/LJ/Widget/*
cgi-bin/LJ/Widget/InboxFolder.pm: <div class="InboxItem_Content" style="display: $display;">$contents</div>
So. It's a leetle bit cargo cult, because you haven't followed the logical chain of inheritance from Widget. You're not sure *why* you need InboxFolder, except that it has the right text. But it is undeniably the right text. Probably.
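For the less cargo-cult route (option 1), the two manual steps are "who is the parent?" and "does the parent define exactly this sub?". Here's a rough sketch of both as shell helpers. The function names are made up, and the first one assumes the module declares its parent with one of the usual Perl mechanisms (use base, use parent, or @ISA), which I haven't confirmed for every DW module.

```shell
# Print the lines where a Perl module declares its parent class(es).
# Assumes one of the usual mechanisms: "use base", "use parent" or @ISA.
show_parents() {
    grep -nE '^[[:space:]]*(use[[:space:]]+(base|parent)[[:space:]]|(our[[:space:]]+)?@ISA[[:space:]]*=)' "$1"
}

# Print the definition line of exactly one sub, whatever the brace style.
# The group after the name rejects longer names such as render_body;
# the "$" alternative accepts a name that ends the line.
find_sub() {
    grep -nE "^[[:space:]]*sub[[:space:]]+$1([^[:alnum:]_]|\$)" "$2"
}

# Example session, run from ~/dw:
#   show_parents cgi-bin/LJ/Widget/InboxFolder.pm
#   find_sub render cgi-bin/LJ/Widget.pm
```

find_sub also sidesteps the house-style guessing game from earlier: it doesn't care whether the name is followed by a space, a brace, a paren or a newline.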
Ok. So we're pretty sure that the messages are rendered by InboxFolder.pm->render_body. We can test that, by adding our own text (I like "FISH", because it's easy to grep for and doesn't crop up much in emails. Remind me to tell you about the time I wrote a major Delicatessen complaints database, though ....) and rechecking the page. Then we can get stuck in. But we're pretty sure this is right. (Oh, here's another handy hint. The comments for InboxFolder->render_body say: '# folder: the view or subset of notification items to display'. You may consider that a giveaway. You may, however, also wish to note the comment in Widget that says : "# FIXME: don't really need all widgets now". Your call.)
So, same routine as before. We start at the very beginning (it's a very good place to start!), and skim the routine looking for anything useful. We do have a target - we want to insert tags above the links that say "Reply" and "Link". Unfortunately, "Reply" and "Link" don't appear in this code. That's not a bad sign - hopefully they're stripped out in a text file somewhere for easy changing into Russian, or French. So, back to eyeballing code.
La la checkboxes, la la mark for deletion la la PRINT OUT MESSAGES. Let's try that. La la title, lala bookmarks, lala $contents, lalal InboxItem_Controls. Huh. Ok, maybe $contents. So, my $contents = $inbox_item->as_html.
That's less useful than it could be, to be honest. What's an inbox_item? Searching for "my $inbox_item" suggests it's actually a member of the @nitems array. Well, yay. What? Oh look, we got passed $nitems in from our calling function. This is my head. This is my desk. This is the conjunction of my head and my desk. This is me deciding to try brute force and ignorance. To be honest, I got a bit lucky here.
dh-tyggerjai@hack:~/dw$ rgrep "sub as_html " cgi-bin/*
produces a page of results. But one of them is "NotificationItem.pm":
cgi-bin/LJ/NotificationItem.pm:sub as_html {
cgi-bin/LJ/Widget.pm:sub as_html {
NItem. Ahhh.
Huh. Or, of course, I could read the comments to render_body: "# items: list of notification items". The lesson you take away is up to you. Bear in mind, however, that the comment doesn't help much unless you already know there's a NotificationItem.pm. But putting the two together gets us NotificationItem.pm, sub as_html. Which looks like this:
# returns contents of this item for user u
sub as_html {
    my $self = shift;
    croak "Too many args passed to NotificationItem->as_html" if scalar @_;
    return "(Invalid event)" unless $self->event;
    return eval { $self->event->content($self->u) } || $@;
}
Uh, I, what? One arg assignment, two lines of error handling, and then one line of code. Well, at least we know where to look next.
sub event {
    my $self = shift;
    $self->_load unless $self->{_loaded};
    return $self->{event};
}
...
Yeah. I had nothing better to do tonight anyway, in case you were wondering. Sub _load isn't much more helpful. Unless you care about the poor singleton qids. I don't. So let's try the second part of event->content. The content bit. Seems like a fair bet.
NotificationItem, of course, has no sub content. That would be too easy! It has an event, that has content, clearly. So what's an event? And what else does it have? Looking a little lower, we see it also has content_summary. So what are we looking at here? Notification of an event, that has content. Huh. Notification, specifically, of a journal post (that's our event). Time for a little more brute force and ignorance....
dh-tyggerjai@hack:~/dw$ rgrep "sub content" cgi-bin/*
cgi-bin/DW/User/ContentFilters.pm:sub content_filters {
cgi-bin/DW/Request/Apache2.pm:sub content_type {
cgi-bin/DW/Request/Apache2.pm:sub content {
cgi-bin/DW/Request/Standard.pm:sub content_type {
cgi-bin/DW/Request/Standard.pm:sub content {
cgi-bin/LJ/Event/NewUserpic.pm:sub content {
cgi-bin/LJ/Event/NewUserpic.pm:sub content_summary {
cgi-bin/LJ/Event/RemovedFromCircle.pm:sub content {
...
That's a little long to be useful, but there's a pattern. Things in LJ/Event have content and content_summary. Ooooh. Getting warm.
dh-tyggerjai@hack:~/dw$ rgrep "sub content" cgi-bin/LJ/Event/*
cgi-bin/LJ/Event/AddedToCircle.pm:sub content {
cgi-bin/LJ/Event/Birthday.pm:sub content {
cgi-bin/LJ/Event/CommunityInvite.pm:sub content {
cgi-bin/LJ/Event/CommunityJoinRequest.pm:sub content {
cgi-bin/LJ/Event/ImportStatus.pm:sub content {
cgi-bin/LJ/Event/ImportStatus.pm:sub content_summary {
cgi-bin/LJ/Event/InvitedFriendJoins.pm:sub content {
cgi-bin/LJ/Event/JournalNewComment.pm:sub content {
cgi-bin/LJ/Event/JournalNewComment.pm:sub content_summary {
cgi-bin/LJ/Event/JournalNewEntry.pm:sub content {
cgi-bin/LJ/Event/JournalNewEntry.pm:sub content_summary {
...
Ahhh. So this is a list of events, about which we might be notified. The specific one we want is, of course, JournalNewEntry.
sub content {
    my ($self, $target) = @_;
    my $entry = $self->entry;
    return undef unless $self->_can_view_content( $entry, $target );
    return $entry->event_html( {
            # double negatives, ouch!
            ljcut_disable => ! $target->cut_inbox,
            cuturl        => $entry->url } )
        . $self->as_html_actions;
}
To cut a very long story slightly less long, that ends up being the function we want. It gets the html description of the event, wraps it in actions ("Reply" and "Link"), and then sends it back. All we have to do now is inject our tags between the html and the actions, and we're done. Which is nice, because it's 3 am....
We're moving away from this, unfortunately, as we migrate from BML files to DW::Controller modules. It's a good thing in that it comes closer to the MVC ideal of programming by separating logic from presentation, but a bad thing in that it's going to be harder for people to find the code that corresponds to a given URL.
It was fun to read for me too, although I got a bit lost in there somewhere. I'm still coming to grips with the command line and using grep/rgrep to find things... maybe I'll have a look around the code properly sometime when I've got 2 or 3 hours to spare to wrap my head around it all. :)
Good idea, but with these things (for me at least) the more I use the command line the more comfortable I become with it; it's a case of practice makes perfect. I have an EeePC and I've become a lot more proficient at using it in the last few months now that I have made a point of trying things from the command line first, just to see how they can be done.