Thread (71 messages) 71 messages, 4 authors, 2017-11-07

Re: [PATCH 4/4] remote-mediawiki: allow using (Main) as a namespace and skip special namespaces

From: Antoine Beaupré <hidden>
Date: 2017-10-30 02:11:07

On 2017-10-29 15:49:28, Eric Sunshine wrote:
On Sun, Oct 29, 2017 at 12:08 PM, Antoine Beaupré [off-list ref] wrote:
quoted
Subject: remote-mediawiki: allow using (Main) as a namespace and skip special namespaces
This patch is more difficult to review than it perhaps ought to be
since it is making multiple unrelated changes.

It's not clear from the description what special namespaces are and
why they need to be skipped. It's also not clear why (Main) is
special. Perhaps the commit message(s) could explain these issues in
more detail.

To simplify review and make it easier to gauge what it going on, it
might make sense to split this patch into at least two: one which
skips "special namespaces", and one which gives special treatment to
(Main).
Agreed, I'll try to do that.
More below...
quoted
Reviewed-by: Antoine Beaupré <redacted>
Signed-off-by: Antoine Beaupré <redacted>
---
diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -264,16 +264,27 @@ sub get_mw_tracked_categories {
 sub get_mw_tracked_namespaces {
     my $pages = shift;
-    foreach my $local_namespace (@tracked_namespaces) {
-        my $mw_pages = $mediawiki->list( {
-            action => 'query',
-            list => 'allpages',
-            apnamespace => get_mw_namespace_id($local_namespace),
-            aplimit => 'max' } )
-            || die $mediawiki->{error}->{code} . ': '
-                . $mediawiki->{error}->{details} . "\n";
-        foreach my $page (@{$mw_pages}) {
-            $pages->{$page->{title}} = $page;
+    foreach my $local_namespace (sort @tracked_namespaces) {
+        my ($mw_pages, $namespace_id);
+        if ($local_namespace eq "(Main)") {
+            $namespace_id = 0;
+        } else {
+            $namespace_id = get_mw_namespace_id($local_namespace);
+        }
+        if ($namespace_id >= 0) {
This may be problematic since get_mw_namespace_id() may return undef
rather than a number, in which case Perl will complain. Since the code
skips the $mediawiki query altogether when it encounters "(Main)", you
could fix this problem and simplify the code overall by simply
skipping the bulk of the foreach loop body instead of mucking around
with $namespace_id. For instance:

    foreach my $local_namespace (sort @tracked_namespaces) {
        next if ($local_namespace eq "(Main)");
        ...normal processing...
    }
Ah yes. I see your point but it doesn't actually skip the query when it
encouters main ($namespace_id >= 0).
quoted
+            if ($mw_pages = $mediawiki->list( {
+                action => 'query',
+                list => 'allpages',
+                apnamespace => $namespace_id,
+                aplimit => 'max' } )) {
+                print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";
The original code did not emit this diagnostic but the new code does
so unconditionally. Is this just leftover debugging code or is
intended that all users should see this information all the time?
This is a known issue that permeates the whole remote at this point, and
it is quite annoying.

https://github.com/Git-Mediawiki/Git-Mediawiki/issues/30

I have, however, considered it useful to include this to show progress
as it can take a while to fetch all namespace information...

Obviously, once we figure out how to silence this stuff (ie. how to
recognize -q), it should be silenced like everything else, but until
then I think it's quite useful.
quoted
+                foreach my $page (@{$mw_pages}) {
+                    $pages->{$page->{title}} = $page;
+                }
+            } else {
+                warn $mediawiki->{error}->{code} . ': '
+                    . $mediawiki->{error}->{details} . "\n";
I guess this is the part which "skips special namespaces". The
original code die()'d but this merely warns. Aside from these "special
namespaces", are there genuine cases when the $mediawiki query would
return an error, and which should indeed die(), or is warning
appropriate for all $mediawiki query error cases?
Maybe I didn't get the indentation right, but this } else { is for query
failures, *not* the if ($namespace_id < 0). So < 0 is just silently
skipped.

The original code was die()'ing on failures, but I think that's a
mistake: we should fetch what we can and warn on the failures. That
allows the user to fix multiple problems at once instead of having to
rerun the script repeatedly.

A.

-- 
Le féminisme n'a jamais tué personne
Le machisme tue tous les jours.
                        - Benoîte Groulx
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help