Re: wip/mk/git-package.mk cleanups

To: tech-pkg%netbsd.org@localhost
Subject: Re: wip/mk/git-package.mk cleanups
From: Aleksej Saushev <asau%inbox.ru@localhost>
Date: Thu, 08 May 2014 02:23:10 +0400

  Hello,

Richard Hansen <rhansen%bbn.com@localhost> writes:

> On 2014-05-06 16:21, Aleksej Saushev wrote:
>>> I believe that the risks associated with creating the symlinks are lower
>>> than the risks associated with overriding WRKSRC in every package that
>>> uses {cvs,git,hg,svn}-package.mk.  Note that the symlink and its
>>> referent are siblings in the filesystem hierarchy, so any relative paths
>>> spanning both sides of WRKSRC shouldn't be a problem.
>> 
>> So far the only reason for symlink is setting WRKSRC.
>> Why don't you want to provide default value to it instead?
>
> I did this because a package is allowed to both download a tarball and
> check out sources from a repository.  For example, a hypothetical
> package foo might do the following:
>
>   1. download foo-1.2.3.tar.gz
>   2. extract foo-1.2.3.tar.gz to ${WRKDIR}/foo-1.2.3
>   3. clone url://to/bar.git into ${WRKDIR}/bar
>   4. build and install the sources in ${WRKDIR}/foo-1.2.3
>   5. install ${WRKDIR}/bar/nifty-script to /usr/bin
>
> If I change the default value of WRKSRC, then this package would try to
> build the sources in ${WRKDIR}/bar instead of ${WRKDIR}/foo-1.2.3.
>
> Maybe it's OK to require a package developers to manually reset
> WRKDIR=${WRKSRC}/${DISTNAME} if they both download a tarball and check
> out a repository.  If so, I think this would do what we want:
>
> .if ${CVS_REPOSITORIES:[\#]} == 1
> WRKSRC?=      ${WRKDIR}/${CVS_MODULE.${CVS_REPOSITORIES:[1]}}
> .endif

This is acceptable, in my opinion.

Maybe a comment would be nice, but this is real nit-picking.
I'm fine without it.

> BTW, I just noticed that there is a (minor) bug in my patches:  If a
> package checks out two different types of repositories (e.g., it both
> clones a Git repo and checks out a CVS module) and ${WRKSRC} doesn't
> exist by post-extract, then it's a race to determine which symlink will
> win.  I'm not sure this is worth fixing; the package developer should
> set WRKSRC in this case anyway.

Just don't play with symlinks. Setting WRKSRC is not such a big deal
than working around problems you didn't expect. A lot of packages set
WRKSRC because the content of distributed file doesn't meet expectations.
I don't understand why this is such a big deal to you.

>>> If a symlink causes a problem in a particular package, the package
>>> maintainer can still set WRKSRC (as required without these patches) to
>>> prevent the symlink from being created.
>>>
>>> I'd like to hear more from the community about this issue (especially
>>> alternative suggestions) before I exclude the symlink feature.
>> 
>> It's a hack that is potentially broken in general on at least one major 
>> platform.
>
> Broken how?  On which platform?

At least pax is known to work in an unexpected way(s) when it encounters a 
symlink.
This happens on FreeBSD at least. Because some other systems have borrowed 
(DragonFly)
or might have borrowed FreeBSD's implementation, it may be less an edge case 
than you think.

>> We don't do such a thing in general case.
>> As for me, these are enough reasons not to do this.

>>> --- a/mk/cvs-package.mk
>>> +++ b/mk/cvs-package.mk
>>> @@ -178,17 +178,19 @@ _CVS_DISTFILE.${repo}=        
>>> ${PKGBASE}-${CVS_MODULE.${repo}}-${_CVS_TAG.${repo}}.tar.
>>>  
>>>  #   command to extract cache file
>>>  _CVS_EXTRACT_CACHED.${repo}=       \
>>> -   if [ -f ${_CVS_DISTDIR:Q}/${_CVS_DISTFILE.${repo}:Q} ]; then    \
>>> -     ${STEP_MSG} "Extracting cached CVS archive 
>>> "${_CVS_DISTFILE.${repo}:Q}"."; \
>>> -     gzip -d -c ${_CVS_DISTDIR:Q}/${_CVS_DISTFILE.${repo}:Q} | pax -r;     
>>> \
>>> -     exit 0;                                                       \
>>> +   tarball=${_CVS_DISTFILE.${repo}:Q};                             \
>>> +   fulltarball=${_CVS_DISTDIR:Q}/$$tarball;                        \
>>> +   if [ -f "$$fulltarball" ]; then                                 \
>>> +           ${STEP_MSG} "Extracting cached CVS archive $$tarball."; \
>>> +           gzip -d -c "$$fulltarball" | pax -r;                    \
>>> +           exit 0;                                                 \
>>>     fi
>> 
>> I'm afraid that this change mostly undoes your efforts on proper quoting.
>
> How so?  As far as I can tell everything here is sufficiently quoted.
>
> Note:  POSIX shells do not subject variable assignments to word
> splitting.  Thus, in the following script, quotes are unnecessary around
> $1 but are necessary around $2 (because bar="$2" is not a variable
> assignment; it's an argument to the export command):
>
>    foo=$1
>    export bar="$2"

While I can agree that the standard may prescribe it, I know that our main 
shells
may not follow the standard close enough. This is our sad reality.
That's why I'm sceptical about it.

On contrary, variation between implementations of make are smaller:
it has to be NetBSD make. As a consequence, the behaviour is defined better. :)

>> Anyway, I find this quoting wars against make and sh not productive enough.
>> This is less an issue than what is described below.
>> 
>>>  #   create cache archive
>>>  _CVS_CREATE_CACHE.${repo}= \
>>> -   ${STEP_MSG} "Creating cached CVS archive 
>>> "${_CVS_DISTFILE.${repo}:Q}"."; \
>>> +   ${STEP_MSG} "Creating cached CVS archive $$tarball.";           \
>>>     ${MKDIR} ${_CVS_DISTDIR:Q};                                     \
>>> -   pax -w ${CVS_MODULE.${repo}:Q} | gzip > 
>>> ${_CVS_DISTDIR:Q}/${_CVS_DISTFILE.${repo}:Q}
>>> +   pax -w ${CVS_MODULE.${repo}:Q} | gzip >$$fulltarball
>>>  .endfor
>> 
>> Because this makes the code hard to track,
>
> If they were inlined it would be easier to read.  :)

If they are inlined, you cannot override them.

>> I suggest that you introduce
>> variables in make rather than in sh.
>
> This defeats the purpose -- the make variables and their expansion
> modifiers would be very long, e.g. ${_CVS_DISTFILE.${repo}:Q} and
> ${_CVS_FULLTARBALL.${repo}:Q} instead of $$tarball and $$fulltarball.
>
> I can scrap this commit, though I prefer not to; I hate working with
> code that has lines longer than 80 characters.

Lines longer than 80 characters are less an issue than untrackable code.
Besides, I'm yet to see at least one reference where 80 columns was
determined as optimal for readability. On contrary, I remember seeing
references where an optimum (one of two in that research) was at 100-120
columns per line (another was at 60-70).

>>> +post-extract: do-git-post-extract
>> 
>> ...and similar.
>> 
>> Because "post-extract" is user-definable, it needs more consideration.
>> It may make the code essentially non-deterministic in the sense
>> that it may break for some but not for others. It is unclear whether
>> this is theoretical or practical as of now. I'd say that you shouldn't use 
>> it.
>
> Note that cvs-package.mk already does:
>
> pre-extract: do-cvs-extract
> do-cvs-extract: .PHONY
>       ...
>
> I would suspect that any argument against do-cvs-post-extract also
> applies to do-cvs-extract.  Given that do-cvs-extract seems to work OK,
> I don't see a problem with do-cvs-post-extract.
>
> Also note that do-cvs-post-extract is a prerequisite to post-extract, so
> it is guaranteed to run before the user-definable post-extract rule.

I'd say that pre-extract is less frequently used (if used at all,
I don't remember any package doing that). "post-extract" is used
frequently enough to be noticable. I would be cautious in reusing it.

>>> -GIT_MODULE.${repo}?=       ${repo}
>>> +# follow the same algorithm that 'git clone' does when "humanizing"
>>> +# the clone URL.  first, start with the URL
>>> +_GIT_MODULE_D_0.${repo}=   ${GIT_REPO.${repo}}
>> 
>> I don't like it, but alright. Still, because this code is really unclear
>> and messy, please, provide a reference to git documentation where this
>> algorithm is described.
>
> It's not documented -- the algorithm is only in source code.  Maybe I'll
> scrap this commit to be consistent with the others.

Then the reference is even more important.
A reference to a source code? I cannot say that it is optimal,
but if there's nothing else...


-- 
HE CE3OH...

References:
- wip/mk/git-package.mk cleanups
  - From: Richard Hansen
- Re: wip/mk/git-package.mk cleanups
  - From: Aleksej Saushev
- Re: wip/mk/git-package.mk cleanups
  - From: Richard Hansen
- Re: wip/mk/git-package.mk cleanups
  - From: Aleksej Saushev
- Re: wip/mk/git-package.mk cleanups
  - From: Richard Hansen
- Re: wip/mk/git-package.mk cleanups
  - From: Aleksej Saushev
- Re: wip/mk/git-package.mk cleanups
  - From: Richard Hansen

Prev by Date: daily pkgsrc CVS update output
Next by Date: Re: wip/mk/git-package.mk cleanups
Previous by Thread: Re: wip/mk/git-package.mk cleanups
Next by Thread: Re: wip/mk/git-package.mk cleanups
Indexes:

Home | Main Index | Thread Index | Old Index