Thread (71 messages) 71 messages, 4 authors, 2017-11-07

Re: [PATCH 1/4] remote-mediawiki: add namespace support

From: Antoine Beaupré <hidden>
Date: 2017-10-29 18:39:49

On 2017-10-29 13:24:03, Eric Sunshine wrote:
On Sun, Oct 29, 2017 at 12:08 PM, Antoine Beaupré [off-list ref] wrote:
quoted
From: Kevin <redacted>

this introduces a new remote.origin.namespaces argument that is a
s/this/This/
ack.
quoted
space-separated list of namespaces. the list of pages extract is then
s/the/The/
ack.
quoted
taken from all the specified namespaces.

Reviewed-by: Antoine Beaupré <redacted>
Signed-off-by: Antoine Beaupré <redacted>
---
diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -1331,7 +1356,12 @@ sub get_mw_namespace_id {
 sub get_mw_namespace_id_for_page {
        my $namespace = shift;
        if ($namespace =~ /^([^:]*):/) {
This is not a new issue, but why capture if $1 is never referenced in
the code below?
meh, i dunno.
quoted
-               return get_mw_namespace_id($namespace);
+               my ($ns, $id) = split(/:/, $namespace);
+               if (Scalar::Util::looks_like_number($id)) {
+                       return get_mw_namespace_id($ns);
So, the idea is that if the input has form "something:number", then
you want to look up "something" as a namespace name. Anything else
(such as "something:foobar") is not considered a valid page reference.
Right?
frankly, i have no idea what's going on here.
quoted
+               } else{
Missing space before open brace.
right.
quoted
+                       return
Not required, but missing semi-colon.
ok.
quoted
+               }
        } else {
                return;
        }
The multiple 'return's are a bit messy. Perhaps collapse the entire
function to something like this:

    sub get_mw_namespace_id_for_page {
        my $arg = shift;
        if ($arg =~ /^([^:]+):\d+$/) {
            return get_mw_namespace_id($1);
        }
        return undef;
    }

Then, you don't need even need Scalar::Util::looks_like_number()
(unless, I suppose, the incoming number is expected to be something
other than simple digits).

In fact, it may be that the intent of the original code *was* meant to
do exactly the same as shown in my example above, but that the person
who wrote it accidentally typed:

    return get_mw_namespace_id($namespace);

instead of the intended:

    return get_mw_namespace_id($1);

So, a minimal fix would be simply to change $namespace to $1.
Tightening the regex as I did in my example would be a bonus (though
probably ought to be a separate patch).
so while i'm happy to just copy-paste your code in there, that's kind of
a sensitive area of the code, as it was originally used only in the
upload procedure, which I haven't tested at all. so i'm hesitant in just
merging that in as is.

i don't understand why or how this even works, to be honest: page names
don't necessarily look like numbers, in fact, they generally don't. i
don't understand why the patch submitted here even touches that function
at all, considering that the function is only used on uploads. I just
cargo-culted it from the original issue...

sigh.

a.

-- 
C'est trop facile quand les guerres sont finies
D'aller gueuler que c'était la dernière
Amis bourgeois vous me faites envie
Ne voyez vous pas donc point vos cimetières?
                        - Jaques Brel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help