Thread: 71 messages, 4 authors, 2017-11-07

Re: [PATCH 5/7] remote-mediawiki: support fetching from (Main) namespace

From: Eric Sunshine <hidden>
Date: 2017-11-01 19:56:57

On Sun, Oct 29, 2017 at 10:51 PM, Antoine Beaupré [off-list ref] wrote:
When we specify a list of namespaces to fetch from, by default the MW
API will not fetch from the default namespace, referred to as "(Main)"
in the documentation:

https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces

I haven't found a way to address that "(Main)" namespace when getting
the namespace ids: indeed, when listing namespaces, there is no
"canonical" field for the main namespace, although there is a "*"
field that is set to "" (empty). So in theory, we could specify the
empty namespace to get the main namespace, but that would make
specifying namespaces harder for the user: we would need to teach
users about the "empty" default namespace. It would also make the code
more complicated: we'd need to parse quotes in the configuration.

So we simply override the query here and allow the user to specify
"(Main)" since that is the publicly documented name.

Thanks, this explanation makes the patch a lot clearer. More below...
Signed-off-by: Antoine Beaupré <redacted>
---
diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -264,9 +264,14 @@ sub get_mw_tracked_categories {
 sub get_mw_tracked_namespaces {
     my $pages = shift;
     foreach my $local_namespace (@tracked_namespaces) {
-        my $namespace_id = get_mw_namespace_id($local_namespace);
+        my ($namespace_id, $mw_pages);
+        if ($local_namespace eq "(Main)") {
+            $namespace_id = 0;
+        } else {
+            $namespace_id = get_mw_namespace_id($local_namespace);
+        }
I meant to ask this in the previous round, but with the earlier patch
mixing several distinct changes into one, I plumb forgot: Would it
make sense to move this "(Main)" special case into
get_mw_namespace_id() itself? After all, that function is all about
determining an ID associated with a name, and "(Main)" is a name.
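
A minimal sketch of what that suggestion might look like, not the actual function from git-remote-mediawiki.perl. The %namespace_ids hash is a hypothetical stand-in for whatever cache or API lookup the real get_mw_namespace_id() performs; only the early return for "(Main)" is the point here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real lookup (cache or MediaWiki API query).
# The entries are illustrative only.
my %namespace_ids = (Talk => 1, File => 6);

sub get_mw_namespace_id {
    my $name = shift;

    # "(Main)" has no canonical name in the API's namespace listing,
    # so map it straight to id 0 before consulting the lookup.
    return 0 if $name eq '(Main)';

    # In this sketch an unknown name simply yields undef.
    return $namespace_ids{$name};
}

print get_mw_namespace_id('(Main)'), "\n";    # prints 0
```

With the special case handled there, the caller in get_mw_tracked_namespaces() could presumably keep its original one-line assignment.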
         next if $namespace_id < 0; # virtual namespaces don't support allpages
-        my $mw_pages = $mediawiki->list( {
+        $mw_pages = $mediawiki->list( {
Why did the "my" of $mw_pages get moved up to the top of the foreach
loop? I can't seem to see any reason for it. Is this an unrelated
change accidentally included in this patch?
             action => 'query',
             list => 'allpages',
             apnamespace => $namespace_id,
--